Our block-chain research group has begun work on an experimental project to scrape the Bitcoin block-chain, understand the data stored there and learn how to record new information in it in the future. This week, SPC commissioned a dedicated multicore server to begin unpacking and mapping the millions of Bitcoin transactions that have already been recorded.
We are using an 8-core 3 GHz Xeon server with 32 GB of RAM, running Linux and Apache Spark. Following Jeffrey Thompson’s excellent example, the script loads Bitcoin blocks into Hadoop for exploratory analysis.
These extraordinary images are the first rendered outputs we have from the script, showing the stack of Bitcoin block-heights relative to one another, as of December 2015.
We step through the script using IPython and a Jupyter notebook. It operates on an off-line copy of the Bitcoin block-chain, reporting on each block, collecting some vital statistics and helping us improve our information extraction procedures.
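To give a flavour of what "reporting on each block" can involve, here is a minimal Python sketch, not the script we are actually running. It assumes the off-line chain is stored in Bitcoin Core's `blk*.dat` layout (4 magic bytes, a 4-byte little-endian length, then the raw block) and reads only the standard 80-byte block header. The names `parse_header` and `iter_blocks` are made up for this illustration.

```python
import hashlib
import struct
from typing import BinaryIO, Dict, Iterator

# Assumption: main-network magic bytes, as written at the start of each
# record in Bitcoin Core's blk*.dat files.
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")
HEADER_SIZE = 80  # a Bitcoin block header is always 80 bytes


def parse_header(header: bytes) -> Dict[str, object]:
    """Unpack an 80-byte block header and compute the block hash."""
    if len(header) != HEADER_SIZE:
        raise ValueError("a block header must be exactly 80 bytes")
    # Little-endian: version, previous block hash, merkle root,
    # timestamp, difficulty bits, nonce.
    version, prev_hash, merkle_root, timestamp, bits, nonce = struct.unpack(
        "<i32s32sIII", header
    )
    # The block hash is the double SHA-256 of the header. Hashes are
    # conventionally displayed with the byte order reversed.
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return {
        "hash": digest[::-1].hex(),
        "version": version,
        "prev_hash": prev_hash[::-1].hex(),
        "merkle_root": merkle_root[::-1].hex(),
        "timestamp": timestamp,
        "bits": bits,
        "nonce": nonce,
    }


def iter_blocks(stream: BinaryIO) -> Iterator[bytes]:
    """Yield raw blocks from a stream of magic + length + block records."""
    while True:
        preamble = stream.read(8)
        if len(preamble) < 8:
            return  # end of file
        magic, size = struct.unpack("<4sI", preamble)
        if magic != MAINNET_MAGIC:
            return  # padding or an unknown record: stop reading
        block = stream.read(size)
        if len(block) < size:
            return  # truncated final record
        yield block
```

In a notebook, one could open a block file in binary mode, loop over `iter_blocks(f)` and call `parse_header(block[:80])` on each result to print the hash and timestamp of every block in turn.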
The next step is to process the binary hash arrays we have already extracted, which make up each block, into a more malleable database format. Spark's resilient distributed dataset (RDD) is a logical collection of data which can be partitioned across machines and optimised for massively parallel access to ‘big data’ libraries.
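As a rough illustration of that step, the sketch below turns raw 32-byte hashes into plain records and spreads the work across an RDD. It is only a sketch under stated assumptions: `hash_to_row` and `rows_via_spark` are hypothetical names, the "height" is simply the position in the input list, and the default of 8 partitions just mirrors the core count of our server. PySpark is imported inside the function, so the record conversion can be tried without a Spark installation.

```python
from typing import Dict, List


def hash_to_row(height: int, raw_hash: bytes) -> Dict[str, object]:
    """Turn one raw hash into a plain record that a database could store.

    The bytes are reversed before hex-encoding, matching the way Bitcoin
    hashes are conventionally displayed.
    """
    return {"height": height, "hash": raw_hash[::-1].hex()}


def rows_via_spark(raw_hashes: List[bytes], partitions: int = 8) -> List[Dict[str, object]]:
    """Convert hashes to records in parallel using a Spark RDD.

    Assumption: PySpark is installed and a local or cluster Spark
    context can be created on this machine.
    """
    from pyspark import SparkContext  # imported lazily on purpose

    sc = SparkContext.getOrCreate()
    # parallelize() splits the list into the requested number of partitions.
    rdd = sc.parallelize(list(enumerate(raw_hashes)), partitions)
    # map() runs on each partition independently; collect() gathers the
    # results back onto the driver.
    return rdd.map(lambda pair: hash_to_row(pair[0], pair[1])).collect()
```

For a data set the size of the full block-chain, one would normally avoid `collect()` and write the records out from the RDD instead. It is used here only to keep the example short.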
Block-chain mapping will help with our ongoing work to reward peer-to-peer drive sharing, first described in August 2015 as Shards of Sharing. Later this year we still hope to have your torrent wallet tracker ready for use, but until then, the mapping work goes on!