By Ashish Rajendra Sai
Article: “Reverse engineering the blockchain as illustrated using eigen decomposition.”
Authors: Farshad Ghassemi Toosi; Jim Buckley; Ashish Rajendra Sai; Andrew Le Gear
Affiliation: Irish Software Research Centre & Horizon Globex
Article Category: Fraud detection
Why this article? The argument that Bitcoin has no intrinsic value has been thrown around in academic circles since the very start. It was not until a Dark Net website started trading in Bitcoin that researchers began paying attention to the technology. That attention was due not to the disruptive nature of the technology but to its illicit use: it became apparent that the pseudo-anonymity of Bitcoin lent itself to illicit activity. Detecting such activity on the Blockchain has been of prime interest to researchers ever since.
In this article, we focus on using reverse engineering to identify fraud on the Blockchain. We have investigated the potential of reverse engineering for fraud detection and published two peer-reviewed articles on the topic. In this week’s installment, we look at the first of these two papers.
Paper Overview:
Background:
Software reverse engineering is the generation of abstracted views of large software systems from detailed implementation artifacts. As such, reverse engineers work towards views that are otherwise obscured by the scale and complexity of those artifacts. The Blockchain hides its higher-level structure in a similar way, so software reverse engineering provides a wealth of techniques that are of potential use if applied to it. The work of our group builds on existing research by investigating and leveraging approaches from the field of software reverse engineering, working towards aggregated, insightful views of Blockchain transactions.
To demonstrate the potential, we chose one such technique, Eigen Decomposition, and applied it to an extract of transaction records on the Ethereum Blockchain. The raw Blockchain only shows relationships between individual wallets and individual contracts; it gives no abstract view that aggregates wallets, for example by organisation. We show how sensible groupings of wallets and contracts, otherwise opaque to a viewer of the Ethereum Blockchain, can be derived.
Methods:
The input to our implementation of Eigen Decomposition is a binary matrix P of size N × M, where N is the total number of items (in this case wallets) and M is the total number of attributes (in this case contracts). Each entry of the matrix records whether an item is associated with an attribute. For example, an entry of 1 indicates that the wallet given by the row index has used the contract given by the column index; an entry of 0 indicates that it has not.
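As a rough illustration of the matrix described above, the following sketch builds P from a list of (wallet, contract) pairs. The function name, the input format and the toy addresses are hypothetical and are not taken from the paper.

```python
def build_usage_matrix(interactions):
    """Build the binary wallet-by-contract matrix P from (wallet, contract) pairs.

    The input format is an assumption made for this sketch.
    """
    wallets = sorted({w for w, _ in interactions})
    contracts = sorted({c for _, c in interactions})
    row = {w: i for i, w in enumerate(wallets)}
    col = {c: j for j, c in enumerate(contracts)}

    # N rows (wallets) by M columns (contracts), initialised to 0.
    P = [[0] * len(contracts) for _ in wallets]
    for w, c in interactions:
        P[row[w]][col[c]] = 1  # repeated use of a contract still yields 1
    return wallets, contracts, P


# Hypothetical toy data, not real Ethereum addresses.
interactions = [
    ("0xA", "token1"), ("0xA", "token2"),
    ("0xB", "token2"),
    ("0xC", "token1"), ("0xC", "token3"), ("0xC", "token1"),
]
wallets, contracts, P = build_usage_matrix(interactions)
for wallet, usage_row in zip(wallets, P):
    print(wallet, usage_row)
```

Because the matrix is binary, it records only whether a wallet used a contract, not how often or for what amount.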
Results:
To illustrate this approach, a 300-block snapshot (approximately 1 hour given average Ethereum confirmation times), from block 5159642 to block 5159942, was isolated and analysed. Applying an Eigen Decomposition analysis with a threshold of 4 common contracts produces 2 distinct multi-wallet groups, of size 35 and 3 respectively. For simplicity of illustration, we examine the smaller, second group here.
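To give a feel for how a threshold on common contracts and an eigenvector can work together, here is a minimal sketch on a toy matrix. It is not the paper's exact procedure, which this summary does not spell out. It assumes that wallets are linked when they share at least `threshold` contracts, and it uses power iteration, the simplest way to approximate a dominant eigenvector, to pick out the most strongly connected group. All function names and data are hypothetical.

```python
import math


def cooccurrence(P):
    """C = P * P^T: C[i][j] counts the contracts that wallets i and j both used."""
    n = len(P)
    return [[sum(a * b for a, b in zip(P[i], P[j])) for j in range(n)]
            for i in range(n)]


def threshold_adjacency(C, threshold):
    """Link two distinct wallets when they share at least `threshold` contracts."""
    n = len(C)
    return [[1 if i != j and C[i][j] >= threshold else 0 for j in range(n)]
            for i in range(n)]


def dominant_eigenvector(A, iterations=200):
    """Approximate the dominant eigenvector of A + I by power iteration.

    Adding the identity shifts every eigenvalue by 1 without changing the
    eigenvectors, and keeps the iteration from oscillating.
    """
    n = len(A)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [v[i] + sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical toy matrix: rows are wallets, columns are contracts.
P = [
    [1, 1, 1, 1, 0],  # wallet 0
    [1, 1, 1, 1, 0],  # wallet 1
    [1, 1, 1, 1, 0],  # wallet 2
    [1, 0, 0, 0, 1],  # wallet 3
    [0, 0, 0, 0, 1],  # wallet 4
]
C = cooccurrence(P)
A = threshold_adjacency(C, threshold=4)
v = dominant_eigenvector(A)

# Wallets with a non-negligible weight in the eigenvector form the group.
group = [i for i, x in enumerate(v) if x > 1e-6]
print(group)
```

Power iteration recovers only the single strongest group, and groups of equal strength would be merged. Finding several groups at once, as in the results above, would need a full eigen decomposition of the matrix.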
At first glance, the correlation between these wallets and contracts could have an innocent explanation: these may be very popular tokens, with many unrelated trading accounts chasing that equity. But that explanation holds up to scrutiny only if the wallets merely overlap in some of their contracts rather than matching exactly. A further manual investigation into the transaction history of these accounts reveals that they have exactly the same transaction histories, interacting with the same contracts for the same amounts. This cannot plausibly be the result of coincidence, and we conclude that these accounts are tightly coupled, probably owned by the same individual or company. While this follow-up analysis was performed manually in this instance, there is no reason why it could not be automated going forward.
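One way the manual check could be automated is sketched below, assuming each transaction is available as a (wallet, contract, amount) tuple. That record format, the function name and the toy values are assumptions made for illustration. The sketch flags wallets whose histories match exactly, as opposed to wallets that merely overlap.

```python
from collections import defaultdict


def group_by_identical_history(transactions):
    """Return groups of two or more wallets whose (contract, amount) histories match exactly."""
    history = defaultdict(list)
    for wallet, contract, amount in transactions:
        history[wallet].append((contract, amount))

    # Wallets land in the same bucket only if their sorted histories are identical.
    buckets = defaultdict(list)
    for wallet, entries in history.items():
        buckets[tuple(sorted(entries))].append(wallet)

    return sorted(sorted(ws) for ws in buckets.values() if len(ws) > 1)


# Hypothetical toy transactions; amounts are integers to avoid rounding issues.
transactions = [
    ("0xA", "token1", 100), ("0xA", "token2", 250),
    ("0xB", "token2", 250), ("0xB", "token1", 100),
    ("0xC", "token1", 100), ("0xC", "token2", 250),
    ("0xD", "token1", 100), ("0xD", "token2", 999),  # same contracts, different amount
]
suspicious = group_by_identical_history(transactions)
print(suspicious)
```

The sketch ignores the order and timing of transactions. A stricter check could include block numbers or timestamps in the comparison.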
Implications for the greater blockchain community:
This article provides a fascinating insight into the world of software reverse engineering. The research suggests that these techniques from classical software engineering can prove to be of significant value to the blockchain world.
We ask the blockchain community: should new blockchain systems be designed to detect fraud without the need for extensive manual data analysis?
Check in each Wednesday for digestible insights surrounding some of the most influential research publications in the crypto/blockchain domain.