Skip to content

Conversation

@zkamvar
Copy link
Contributor

@zkamvar zkamvar commented Aug 4, 2016

This partially addresses #9 by adding a function that will create a table or list of tables as output. @daattali, @kplaney, @thibautjombart, what do you think?

NEW FUNCTIONS

  • tablesFromMatch() generates a tidy table or a list of tables containing the matches
  • getIndexList() switches the weight output of matchEpiData() to index output

The function to generate the table(s) will only output the information used for matching. This way, the user can use it as a reference to filter their own data. When the output is a single tidy table, the information of the group, data set, index, and weight/score are included.

SHINY

I've modified server.R slightly to use the giveWeights = TRUE argument for matchEpiData(). This will allow us to write something to display the weights/scores in the future.

zkamvar added 11 commits July 16, 2016 16:38
I've added a function to return tables from the
match query. It can work in addition or in lieu
of `matchEpiData()`. I have added documentation
for this and also updated the documentation of
`matchEpiData()`.
This will allow future development so that we can
give the user the scores/weights in the table that
is output.
@daattali
Copy link
Collaborator

daattali commented Aug 5, 2016

Nice work! I don't think you need to add all the library(epimatch) to the examples sections, I don't commonly see that in Hadley et al's packages.

Regarding tablesFromMatch: this is excellent. I haven't looked into it for more than a minute, but does it differentiate between which rows are from which dataset? In the shiny app, when there's a table showing records from two different linelists, it says which records are from which dataset

@zkamvar
Copy link
Contributor Author

zkamvar commented Aug 6, 2016

I think I put the library(epimatch) calls in there during a layover because I was being silly. I can take them out if you like.

tablesFromMatch() includes a column the specifies data set as well as their respective indices.

@daattali
Copy link
Collaborator

daattali commented Aug 6, 2016

Oh I see, really nice function. To be nitpicky:

  • are the row names in the output needed? If they are needed, would it not make more sense to have them as a proper variable rather than rownames?
  • I do see now that using collapse=T prints out where each record came from. But when collapse=F, I don't think there's any identifying information?

Other than that , this is awesome

@zkamvar
Copy link
Contributor Author

zkamvar commented Aug 6, 2016

Oh I see, really nice function. To be nitpicky:

Nits are meant to be picked!

are the row names in the output needed? If they are needed, would it not make more sense to have them as a proper variable rather than rownames?

They are not needed. I can easily get rid of them.

I do see now that using collapse=T prints out where each record came from. But when collapse=F, I don't think there's any identifying information?

collapse = FALSE does not identify the origin (except by rownames, which is a no-no). Should I add that as an extra column?

@daattali
Copy link
Collaborator

daattali commented Aug 6, 2016

It seems to me that it would be more consistent behaviour and make sense to
the end user if the dataset of origin on each record was mentioned. Whether
the result is multiple tables separately or collapsed should probably not
change the amount of output information. If that makes sense to you? Just
my initial thought, but obviously you spent a lot more time with these
functions and data so it's your call! But regarding removing the rownames,
that I do feel strongly about if they're not meaningful :)


http://deanattali.com

On 5 August 2016 at 19:01, Zhian N. Kamvar notifications@github.com wrote:

Oh I see, really nice function. To be nitpicky:

Nits are meant to be picked!

are the row names in the output needed? If they are needed, would it not
make more sense to have them as a proper variable rather than rownames?

They are not needed. I can easily get rid of them.

I do see now that using collapse=T prints out where each record came from.
But when collapse=F, I don't think there's any identifying information?

collapse = FALSE does not identify the origin (except by rownames, which
is a no-no). Should I add that as an extra column?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AA6IFHP5dpBnilaCrgukZwAbnYuK9n5sks5qc-sWgaJpZM4JdJBM
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants