So recently, a new study came out about the "gay gene" and, as usual, most of it said it's a complex issue that we haven't thoroughly investigated, that sexuality isn't 100% environmental or 100% genetic, blah blah blah, the same old thing. Source.
The cool thing is that this time, they actually bothered to try and see if they could predict someone's sexuality. Now, remember a few weeks ago, when I brought up A.I. rights and machine learning? Well, it's a good thing you all had a crash course, because they actually used a machine learning algorithm called FuzzyForest to develop some sort of "gaydar".
If you need a refresher (and I KNOW I'll get corrected on this, so I apologize in advance for any inaccuracy and preemptively agree with any corrections that are actually correct), machine learning algorithms are basically programs that let a computer program itself with a purpose in mind. In this case, they gave the computer data for 47 pairs of twins: each person's sexuality and 400,000 epigenetic markers (which "latch onto DNA and help turn genes on or off"). The computer basically took that data and made "hypotheses". These "hypotheses" start off simple and inaccurate, like "if marker 77772 is X then the person is not straight". However, some are more accurate than others, so it builds off those to develop more complex and more accurate "hypotheses" like "if X applies to markers 22 and 10 but not 12 and 14 but only while 55 unless 15 and 16 and 15 but not 33 (unless of course 99 and 199)… etc."
Again, I'm probably wrong, so listen to whoever corrects this post.
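To make the "simple hypotheses" step a bit more concrete, here's a toy sketch in Python. To be clear, this is NOT FuzzyForest and not the study's data: the data is randomly generated, the marker count (200) is made up to keep it small, and marker 42 is a hypothetical marker I rigged to line up with the label most of the time. All it does is score every one-marker rule of the form "if this marker is 1, guess label 1" and pick the best one, which is roughly the starting point I described above.

```python
import random

random.seed(0)  # fixed seed so the toy data is reproducible

N_PEOPLE = 94        # 47 twin pairs, as in the post
N_MARKERS = 200      # made-up number; the study reportedly had ~400,000
SIGNAL_MARKER = 42   # hypothetical marker, rigged below to track the label

# Synthetic labels (0 or 1) and synthetic on/off markers for each person.
labels = [i % 2 for i in range(N_PEOPLE)]
markers = []
for i, label in enumerate(labels):
    row = [random.randint(0, 1) for _ in range(N_MARKERS)]
    # Rig one marker: it matches the label except for every 7th person.
    row[SIGNAL_MARKER] = label if i % 7 != 0 else 1 - label
    markers.append(row)

def rule_accuracy(marker_index):
    """Accuracy of the one-marker rule 'if this marker is 1, guess label 1'."""
    hits = sum(1 for row, y in zip(markers, labels) if row[marker_index] == y)
    return hits / len(labels)

# Score every one-marker "hypothesis" and keep the most accurate one.
scores = sorted(((rule_accuracy(j), j) for j in range(N_MARKERS)), reverse=True)
best_accuracy, best_marker = scores[0]

print(f"best single-marker rule: marker {best_marker}, accuracy {best_accuracy:.2f}")
```

A real method would then combine the better rules into more complicated ones (and check them on data it hasn't seen), but I'm leaving that part out since I'd just be guessing at the details.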
Anyway, they apparently managed to get up to a 70% success rate. That figure could be a little optimistic, depending on a lot of factors that aren't actually ethics violations but still could have been used to put the predictions in a favorable light. However, from what I can gather, it shouldn't be too terribly far from that, since most of the criticisms of the validity come down to the sample size, and every single statistics class I've taken has said something along the lines of: an accurate testing method is almost always more important for getting accurate results than a large sample size. (For example, if you could choose to flip a fair coin 20 times or a coin weighted 3:1 in favor of heads 10,000 times, which would you choose if you wanted to get as close to a 1:1 ratio as possible?)
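If you want rough numbers for the coin comparison, here's a quick sketch. It uses the standard result that the expected squared gap between the observed share of heads and 50% is the coin's built-in bias squared plus the random scatter, p(1-p)/n. The three coins in the list are just illustrations I picked, not anything from the study.

```python
import math

def rms_distance_from_even(p_heads, n_flips):
    """Typical (root-mean-square) gap between the observed heads share and 50%."""
    bias = p_heads - 0.5                           # built-in offset; more flips never fix it
    variance = p_heads * (1 - p_heads) / n_flips   # random scatter; shrinks with more flips
    return math.sqrt(bias ** 2 + variance)

# Illustrative scenarios: (description, probability of heads, number of flips)
scenarios = [
    ("fair coin, 20 flips", 0.5, 20),
    ("501:499 coin, 10,000 flips", 0.501, 10_000),
    ("3:1 coin, 10,000 flips", 0.75, 10_000),
]

for name, p_heads, n_flips in scenarios:
    gap = rms_distance_from_even(p_heads, n_flips)
    print(f"{name}: typically off by about {gap:.3f}")
```

The gaps come out to roughly 0.11, 0.005 and 0.25. So a clearly lopsided coin loses to the fair one no matter how many times you flip it, while a bias as tiny as 501:499 gets swamped by the scatter you'd see in only 20 flips. In other words, the method matters most when its flaw is big compared to the noise.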
Sorry, I'll get back on track. Anyway, this suggests that there are no glaring fatal flaws in the study, so if they were to double check with a larger sample size, it wouldn't be irrational to predict the accuracy would be about as high, unless there was some variable that everyone overlooked, like they chose everyone from the same town, or something like that.
And while a 70% success rate might not seem high, since it's only 20 percentage points better than a coin flip, and since guessing that everyone is straight would most likely beat it, I'd like to point out a few things. First, humans aren't spread out evenly across sexual orientations, so a random sample of the population wouldn't give you a fair test; you'd have to guess the sexual orientation of a group of people that had equal parts straight and gay (or straight and gay and bi and asexual). Second, the only information the computer had was from the epigenetic markers, and I don't know about you, but my gaydar isn't even 70% accurate, and a whole bunch of 0s and 1s related to said markers or a sample of a person's spit would not improve my accuracy at all.
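Here's the "just guess everyone is straight" point as a tiny sketch. The 95/5 split is a number I made up purely for illustration (it's not from the study or any survey), and the helper names are mine.

```python
def accuracy(guesses, truth):
    """Share of guesses that match the true labels."""
    return sum(g == t for g, t in zip(guesses, truth)) / len(truth)

def make_sample(n_straight, n_gay):
    """Build a list of true labels with the given group sizes."""
    return ["straight"] * n_straight + ["gay"] * n_gay

samples = {
    "lopsided sample (assumed 95/5 split)": make_sample(95, 5),
    "balanced sample (50/50 split)": make_sample(50, 50),
}

for name, truth in samples.items():
    lazy_guesses = ["straight"] * len(truth)   # the "everyone is straight" strategy
    print(f"{name}: lazy guess scores {accuracy(lazy_guesses, truth):.0%}")
```

On the lopsided sample the lazy guess scores 95% without knowing anything, and on the balanced sample it drops to 50%. That's why a 70% score only means something when it's measured against a balanced group.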
Sorry this was so long, it's just that I really like this study, because it ties together sexuality, math, and machine learning, it's the first (that I know of) that was able to make any sort of prediction, and it adds evidence for the epigenetic theory of homosexuality (aka, it's your mom's womb which plays the biggest part in your sexuality, which, while it is an environmental factor, is not something a developing fetus has any sort of choice or control over).
Again, sorry in advance for any inaccuracy. I tried my best, but I'm not perfect.