R

Neighbors-based prediction of physical function after total knee arthroplasty

The purpose of this study was to develop and test personalized predictions for functional recovery after Total Knee Arthroplasty (TKA) surgery, using a novel neighbors-based prediction approach. We used data from 397 patients with TKA to develop the …

Generating publication ready tables with {gtsummary} and {flextable}

Publication-ready tables with {gtsummary} and {flextable}. I’ve been using R for a while now, and until recently I hadn’t found a good way to produce publication-ready tables that I can customize easily. I thank the developers of {gtsummary} and {flextable} for saving me the trouble of figuring out how to customize the back end of a table generated in R for a Microsoft Word document. The purpose of this post is to share a few tips on how I create a publication-ready table using {gtsummary} and {flextable}.

tidymodels vs mlr3

Comparison of packages for prediction: {tidymodels} vs. {mlr3}. I wrote a lot of prediction code for my dissertation using {caret}, and I wasn’t interested in learning other frameworks… simply because I hadn’t needed to. But I’ve learned so much from dabbling with the {caret} package, and I think people interested in prediction-related work will benefit from trying out different packages. So I’m writing this post for those who want to see what the whole process looks like when using {tidymodels} or {mlr3}.

Sample Size Calculation with R Extended

Sample Size Calculation in R… Extended! This is an extension from a post I saw here. When thinking about the design of a trial that we want to conduct for a client, one of the things we need to work out is how many patients we would need to recruit to see the effect that we propose to see. We’re going to walk through a couple of scenarios and see how we can use R to answer the sample size and power question when designing our studies.
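
As a rough illustration of the sample size and power question, here is a minimal sketch using base R's `power.t.test()` for a two-arm trial with a continuous outcome. The difference to detect (5 points), the standard deviation (10), the power, and the significance level are made-up assumptions for illustration, not values from the post.

```r
# Assumed design inputs (illustrative only, not taken from the post)
delta <- 5      # smallest between-arm difference worth detecting
sigma <- 10     # assumed common standard deviation
alpha <- 0.05   # two-sided significance level
power <- 0.80   # target power

# Solve for the per-arm sample size of a two-sample t-test
res <- power.t.test(delta = delta, sd = sigma, sig.level = alpha,
                    power = power, type = "two.sample",
                    alternative = "two.sided")
n_per_arm <- ceiling(res$n)  # round up to whole patients

# Reverse question: what power would 50 patients per arm give?
res_power <- power.t.test(n = 50, delta = delta, sd = sigma,
                          sig.level = alpha)
achieved_power <- res_power$power
```

Leaving exactly one of `n`, `delta`, `sd`, `power`, or `sig.level` unspecified tells `power.t.test()` which quantity to solve for, so the same call pattern covers both the "how many patients" and "how much power" scenarios.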

Tidy evaluation

Tidy evaluation… extended! This is an extension from a post I saw here. There are several instances when I want to pass unquoted variable names to a function to generate outputs, without having to quote them. There are also instances when I need to pass the variable names as quoted strings. We’ll walk through some examples and how we need to set up code to do that.
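
To make the unquoted vs. quoted distinction concrete, here is a minimal sketch assuming {dplyr} 1.0 or later. The helper names `mean_by()` and `mean_by_chr()` are hypothetical and `mtcars` is only a stand-in dataset; neither comes from the post.

```r
library(dplyr)

# Hypothetical helper: accepts UNQUOTED column names.
# The embrace operator {{ }} forwards the caller's column to dplyr.
mean_by <- function(data, group, var) {
  data %>%
    group_by({{ group }}) %>%
    summarise(mean = mean({{ var }}, na.rm = TRUE), .groups = "drop")
}

# Hypothetical helper: accepts QUOTED column names (character strings).
# all_of() selects by name; .data[[ ]] looks the column up by string.
mean_by_chr <- function(data, group, var) {
  data %>%
    group_by(across(all_of(group))) %>%
    summarise(mean = mean(.data[[var]], na.rm = TRUE), .groups = "drop")
}

unquoted <- mean_by(mtcars, cyl, mpg)
quoted   <- mean_by_chr(mtcars, "cyl", "mpg")
```

Both calls should give the same group means; which form is more convenient depends on whether the column names are typed interactively or arrive as strings (for example, from a loop or a Shiny input).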

Timeline Graph

Timeline graphs… extended! I was checking a few online forums and found someone asking how to replicate a timeline graph. I got super interested because clients may sometimes want to visualize milestones in a way that isn’t tabular. A few resources that I came across are Ben Alex Keen’s site, Stack Exchange (for preserving dates when using the ifelse() and if() functions), and lastly Data Nova.
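
The date-preservation issue mentioned above can be sketched in base R: `ifelse()` drops the `Date` class and returns the underlying day counts. The dates and cutoff below are made up for illustration and are not from the post.

```r
# Illustrative milestone dates (assumed values)
dates  <- as.Date(c("2020-01-15", "2020-06-01", "2020-11-30"))
cutoff <- as.Date("2020-06-01")

# ifelse() strips attributes, so the result is numeric, not Date
stripped <- ifelse(dates < cutoff, dates, cutoff)
class(stripped)

# Option 1: convert the day counts back to Date
restored <- as.Date(stripped, origin = "1970-01-01")

# Option 2: avoid ifelse() and assign by logical index,
# which keeps the Date class throughout
kept <- dates
kept[dates >= cutoff] <- cutoff
```

`dplyr::if_else()` is another common way around this, since it preserves the type of its inputs.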

{caret} custom function implementation

Problem: Optimal Probability Threshold. It’s been a while! I’m writing this post for a couple of reasons: (1) I don’t want to only be writing my dissertation, and (2) someone found me on GitHub and asked me to help them. I am happy to be writing this post for both of those reasons, so let’s jump straight into the problem. Recently, a random stranger e-mailed me about a problem they were having at work.

Tidy simulation!

Simulations? But… why? When I was taking classes at the University of Colorado Anschutz Medical Campus, I was one of those guys who was strong in programming/coding but had to spend much of my time on the math behind statistics. I stopped taking math after vector calculus during undergrad (which I regret) because I was a Biochem major. I thoroughly enjoyed learning all the tricks, but I so wish I had learned about (real) analysis and higher (proof-based) maths when I was younger.

Performance of my stock vs. cpi

Intro. As a New Year’s resolution, I made a promise to myself to be more conscious about the investments that I’m making. Of the few that I have, I’ve decided to look into the historical performance of one of my funds. I want to compare the increase in its price over time with overall consumer prices in the U.S. (using the CPI-U). Using quantmod to see performance. quantmod is an R package that makes it easy for users to track their stocks.

Replacing missing values in R

Problem. I’m a regular visitor of the r/rstats subreddit, and recently there was a post about replacing missing values using a certain logic. Specifically, here is the problem: below is a dataframe where each row represents a city, and the idea is to fill in the missing unemployment rates. The OP (original poster) wanted to fill in the NA values based on the mean unemployment rate of cities with the same State and Size.
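
The fill logic described above can be sketched as a grouped mean imputation. This is a minimal example assuming {dplyr}; the data frame, its values, and the column name `unemployment_rate` are invented for illustration (only `State` and `Size` are named in the post).

```r
library(dplyr)

# Toy data (made up): one row per city, with some missing rates
cities <- data.frame(
  City  = c("A", "B", "C", "D", "E", "F"),
  State = c("CO", "CO", "CO", "NY", "NY", "NY"),
  Size  = c("Large", "Large", "Small", "Large", "Large", "Large"),
  unemployment_rate = c(4.0, NA, 3.0, 6.0, 8.0, NA)
)

# Replace each NA with the mean of the non-missing rates
# among cities sharing the same State and Size
filled <- cities %>%
  group_by(State, Size) %>%
  mutate(
    unemployment_rate = if_else(
      is.na(unemployment_rate),
      mean(unemployment_rate, na.rm = TRUE),
      unemployment_rate
    )
  ) %>%
  ungroup()
```

One caveat with this approach: if every city in a State and Size group is missing, the group mean is `NaN`, so those rows would need a fallback rule (for example, the State-level mean).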