About a month ago /u/bitcoin_error_log [offered 5 BTC for an in-depth analysis](http://ift.tt/1B6glZu) to confirm manipulation in the block size debate. My full analysis is available **[here in pdf format](http://ift.tt/1MB8AMl)**. It confirms that there is evidence of manipulation in the block size debate, and even on other threads on /r/Bitcoin.

The analysis looks at all pairwise comparisons of user accounts. It uses topic modeling, NLP and machine learning to classify which pairs of users are likely to be sock puppets. A couple of charts of the topics discovered in /r/Bitcoin comments are available here: http://ift.tt/1Ibc2h9. The resulting model was able to detect multi-account use with up to 85% accuracy.

The results showed that several accounts were seemingly created just for trolling bitcoin experts, especially /u/petertodd. There were also a few deleted accounts in the results and several throwaway accounts. I didn't limit the analysis to threads relevant to the block size debate, so the model was able to find sock puppets on other threads as well. One example of this was evidence of Roger Ver - /u/MemoryDealers - **possibly** using a sock account on threads relevant to the recent scandal with /u/Okcoinbtc.

There are, however, **quite a few false positive results, so please bear this in mind before drawing conclusions from the rankings.**

The rankings for probable pairs of sock puppets are available [here](http://ift.tt/1MB8vbB) - this lists the top scoring pairs of users who are suspected of having sock accounts. A dictionary of suspected multi-account users is available [here](http://ift.tt/1Ibc0ph) - each key is a user, and the values are that user's possible alternate accounts.

The entire project code is available on my github [here](http://ift.tt/1MB8AMn). The database of comments is available [here](http://ift.tt/1Ibc2xn) and I'm in the process of uploading the dataframes generated by the code.
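
For anyone who wants a feel for the pairwise idea without reading the whole report, here is a minimal toy sketch. It is **not** the code from the github repo and it does not use the topic model or the classifier from the report: it swaps in plain bag-of-words cosine similarity as a stand-in score. The user names, comments, the `0.8` threshold and all function names are made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict
from itertools import combinations


def bag_of_words(comments):
    """Lowercased word counts over all of one user's comments."""
    counts = Counter()
    for text in comments:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_pairs(comments_by_user):
    """Score every pair of users and return the pairs, highest score first."""
    vectors = {user: bag_of_words(c) for user, c in comments_by_user.items()}
    scored = [
        (cosine(vectors[u], vectors[v]), u, v)
        for u, v in combinations(sorted(vectors), 2)
    ]
    return sorted(scored, reverse=True)


def suspects_dict(ranked, threshold):
    """Map each user to the other accounts scoring at or above the threshold."""
    suspects = defaultdict(list)
    for score, u, v in ranked:
        if score >= threshold:
            suspects[u].append(v)
            suspects[v].append(u)
    return dict(suspects)


# Hypothetical toy data, not taken from the real comment database.
comments_by_user = {
    "user_a": ["the block size limit should stay small",
               "small blocks keep nodes cheap"],
    "user_b": ["the block size limit should stay small imo",
               "cheap nodes need small blocks"],
    "user_c": ["anyone tried the new hardware wallet",
               "great coffee shop accepts bitcoin"],
}

ranked = rank_pairs(comments_by_user)
suspects = suspects_dict(ranked, threshold=0.8)  # threshold is an arbitrary assumption

for score, u, v in ranked:
    print(f"{u} / {v}: {score:.3f}")
print(suspects)
```

The output has the same shape as the two files linked above: a ranked list of user pairs, and a dictionary keyed by user. A word-overlap score like this flags people who simply talk about the same subject, which is one way false positives creep in, so treat a high score as a lead and not as proof.
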

---------------------

Other users who responded to the bounty did not dig deep enough into the data and were not able to find evidence of socks. An analysis of this scale takes **a lot** of time and computational resources.

If /u/bitcoin_error_log and Mike Belshe ( /u/bitgo_ben ?) deem the report credible enough for the bounty, my BTC address is 1NDXgokkrWxwisQpGNb9hWwHpE43RQqcSk

Cheers

---------------------

P.S. **I mentioned in the report that similar techniques can be used to determine the identity of Satoshi (assuming he's still active in the community). I suspect that would be an easier problem to solve than detecting socks.**

**I'm a data scientist by trade and I'm always looking for new projects to work on. If you are interested in collaborating, feel free to PM me or contact me on my website.** www.andrehaynes.me

link: http://ift.tt/1Ibs2Br