Counting by multiple groups, sometimes called crosstab reports, can be a useful way to look at data ranging from public opinion surveys to medical tests. For example, how did people vote by gender and age group? How many software developers who use both R and Python are men vs. women?

There are a lot of ways to do this kind of counting by categories in R. Here, I’d like to share some of my favorites.

For the demos in this article, I’ll use a subset of the Stack Overflow Developers survey, which surveys developers on dozens of topics ranging from salaries to technologies used. I’ll whittle it down to columns for languages used, gender, and whether they code as a hobby. I also added my own LanguageGroup column for whether a developer reported using R, Python, both, or neither.
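One way a column like LanguageGroup could be derived is sketched below. This is a minimal illustration rather than the exact code used for this article: the toy `langs` vector, the `uses_language()` helper, the semicolon delimiter, and the labels "Both" and "R" are all assumptions made for the example.

```r
# Toy stand-in for the survey's language column (values made up for illustration;
# the semicolon delimiter is an assumption about how the raw data is stored)
langs <- c("HTML/CSS;Java;JavaScript;Python", "C++;R;SQL", "Python;R", "HTML/CSS")

# Hypothetical helper: TRUE when `target` is one of the semicolon-separated entries
uses_language <- function(x, target) {
  vapply(strsplit(x, ";", fixed = TRUE),
         function(parts) target %in% parts,
         logical(1))
}

uses_r <- uses_language(langs, "R")
uses_python <- uses_language(langs, "Python")

# The labels "Both" and "R" are assumed for this sketch
language_group <- ifelse(uses_r & uses_python, "Both",
                  ifelse(uses_r, "R",
                  ifelse(uses_python, "Python", "Neither")))

language_group
```

Matching whole entries instead of searching for the letter "R" anywhere in the string avoids false hits on languages such as Ruby or Rust.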

If you’d like to follow along, the last page of this article has instructions on how to download and wrangle the data to get the same data set I’m using.

The data has one row for each survey response, and the four columns are all character strings.

str(mydata)
'data.frame':	83379 obs. of  4 variables:
 $ Gender            : chr  "Man" "Man" "Man" "Man" ...
 $ LanguageWorkedWith: chr  "HTML/CSS;Java;JavaScript;Python" "C++;HTML/CSS;Python" "HTML/CSS" "C;C++;C#;Python;SQL" ...
 $ Hobbyist          : chr  "Yes" "No" "Yes" "No" ...
 $ LanguageGroup     : chr  "Python" "Python" "Neither" "Python" ...

I filtered the raw data to make the crosstabs more manageable, including removing missing values and keeping only the two largest genders, Man and Woman.
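That kind of filtering might look roughly like the base R sketch below. The `raw` data frame and its values are made up for illustration, and the labels in `keep_genders` are assumptions about how the two largest groups are coded; the actual wrangling steps for this data set may differ.

```r
# Made-up raw rows for illustration; the real survey has many more rows and columns
raw <- data.frame(
  Gender = c("Man", "Woman", NA, "Non-binary", "Man"),
  Hobbyist = c("Yes", "No", "Yes", "Yes", NA),
  stringsAsFactors = FALSE
)

# Assumed labels for the two largest gender groups
keep_genders <- c("Man", "Woman")

# Drop rows with any missing value, then keep only the chosen genders
filtered <- raw[complete.cases(raw), ]
filtered <- filtered[filtered$Gender %in% keep_genders, ]

nrow(filtered)
```

Filtering on complete cases first means later counts are not thrown off by NA categories.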