Thomson Reuters Launches Bitcoin Sentiment Data Feed for Trading and Risk Management
Bitcoin pricing prediction and trading simulation through time series and sentiment analysis. The volume of Bitcoin transactions has grown substantially over the last few months, drawing considerable interest to this cryptocurrency. We propose to build a framework capable of predicting the evolution of the Bitcoin market and simulating different trading strategies.
The main differences between a Bitcoin market and current stock exchanges are its high volatility and the fact that transactions are not instantaneous (it can take up to 5 minutes for a transaction to complete). The main goal of time series analysis in all fields, not only financial markets, is forecasting. Once the model has been properly chosen and trained, we will re-train the system with newly observed values at a frequency we have not defined yet. In order to maintain scalability, we will automatically prune the oldest transactions so that the data remains of manageable size.
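The pruning idea above can be sketched as a fixed-size rolling window. This is only an illustration: the class name `RollingPriceWindow`, the window size, and the moving-average "model" are assumptions standing in for whatever forecasting model is eventually chosen.

```python
from collections import deque


class RollingPriceWindow:
    """Keeps only the most recent `max_size` prices; older ones are pruned."""

    def __init__(self, max_size):
        # A deque with maxlen drops its oldest item automatically when full.
        self.prices = deque(maxlen=max_size)

    def add(self, price):
        self.prices.append(price)

    def forecast_next(self, lookback=3):
        """Placeholder model (assumption): mean of the last `lookback` prices."""
        recent = list(self.prices)[-lookback:]
        if not recent:
            raise ValueError("no data to forecast from")
        return sum(recent) / len(recent)


window = RollingPriceWindow(max_size=4)
for p in [100.0, 102.0, 101.0, 105.0, 107.0]:
    window.add(p)

print(list(window.prices))     # the oldest value (100.0) has been pruned
print(window.forecast_next())  # mean of 101.0, 105.0 and 107.0
```

Re-training on a schedule would then amount to calling the real model's fitting routine on the current contents of the window.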
For the live crawling of Bitcoin transactions, we have already written a Scala program that automatically crawls new transactions (the number of transactions can vary, but on average we expect less than 1 transaction per second).
In order to enhance our predictions, we plan to use sentiment analysis, whose aim is to determine the attitude or polarity of a document. We plan to apply natural language processing and text analysis to Tweets, which are currently very active about the mood of the Bitcoin market. Sentiment analysis applications usually compute a polarity score within a bounded range. Scalability is not a big concern here, because each record needs only 2 bytes (a Short) for the day column, enough to index tens of thousands of days, and only 1 byte (a char) for the scaled polarity. For each month, we can get all the tweets, which weigh about 30 to 50 GB on average.
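To make the three-bytes-per-record estimate concrete, here is a minimal sketch of such an encoding. The polarity range of [-1.0, 1.0], the scaling factor of 100, and the unsigned day index are assumptions made for illustration; the proposal does not state them.

```python
import struct

# "<Hb": little-endian, unsigned 2-byte day index + signed 1-byte polarity.
RECORD = struct.Struct("<Hb")


def encode(day_index, polarity):
    """Pack one record. `polarity` is assumed to be a float in [-1.0, 1.0]."""
    scaled = round(polarity * 100)  # integer in [-100, 100], fits in one byte
    return RECORD.pack(day_index, scaled)


def decode(blob):
    """Unpack one record back into (day_index, polarity)."""
    day_index, scaled = RECORD.unpack(blob)
    return day_index, scaled / 100


blob = encode(412, -0.37)
print(len(blob))     # size of one record in bytes
print(decode(blob))  # day index and polarity, rounded to two decimals
```

Scaling to an integer loses precision beyond two decimals, which is usually acceptable for a daily aggregate score.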
We intend to download these files, upload them one by one to the cluster, and preprocess them (removing any line that doesn't talk about Bitcoin). For this task, we are going to use a really simple filtering job. As sentiment analysis frameworks are already widely deployed, we will first try to use some open source sentiment algorithm implementations. If these frameworks do not give good results (either in terms of accuracy or running time), we could try implementing our own version of a model.
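The preprocessing step described above (dropping every line that does not mention Bitcoin) could look roughly like the following. The keyword pattern is hypothetical, since the proposal does not list the exact terms to match, and a real job would stream lines from the archive files rather than from an in-memory list.

```python
import re

# Hypothetical keyword pattern; the exact terms are not given in the proposal.
BITCOIN_PATTERN = re.compile(r"\b(bitcoin|btc)\b", re.IGNORECASE)


def keep_bitcoin_lines(lines):
    """Yield only the lines that mention Bitcoin."""
    for line in lines:
        if BITCOIN_PATTERN.search(line):
            yield line


sample = [
    "Bitcoin just hit a new high!",
    "Lovely weather today",
    "Selling all my #BTC, this is scary",
]
kept = list(keep_bitcoin_lines(sample))
print(kept)  # the weather line is filtered out
```

Because the function is a generator, it processes one line at a time and never needs a whole 30 to 50 GB file in memory.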
There is no doubt a correlation between the "mood" of the documents and the prices, but we would like to show that it is not only the market driving the news (as an example, a drop in price would generate negative coverage in the media afterwards).
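One simple way to probe which series leads the other is a lagged correlation: compare sentiment on day t with the price on day t + lag. The sketch below uses made-up toy data in which prices follow sentiment by one day; the function names and the choice of Pearson correlation are assumptions, not the method the authors committed to.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(sentiment, prices, lag):
    """Correlate sentiment on day t with the price on day t + lag (lag >= 0)."""
    if lag == 0:
        return pearson(sentiment, prices)
    return pearson(sentiment[:-lag], prices[lag:])


# Toy data (made up): each price reflects the previous day's sentiment.
sentiment = [0.1, 0.5, -0.2, 0.3, -0.4, 0.2]
prices = [100, 101, 105, 98, 103, 96]

same_day = lagged_correlation(sentiment, prices, 0)
next_day = lagged_correlation(sentiment, prices, 1)
print(same_day, next_day)  # the one-day lag correlates far more strongly
```

A higher correlation at a positive lag than at lag zero would be consistent with mood preceding price, although correlation alone does not establish causation.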
In order to predict the trend of the market's mood, we're going to use time-series models on the output of the sentiment analysis. A web front-end will show real-time Bitcoin exchange trends in a graph illustrating our predictions and the current market mood.
It would also make it possible to visualize the gain over a chosen time span for a given fictitious starting investment. Several data visualization tools are freely available on the web (D3, for example). We will need to crawl 2 different kinds of data: Bitcoin transaction data and news website data.
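The gain on a fictitious starting investment can be computed by replaying a strategy's signals over historical prices. The following is a minimal sketch under simplifying assumptions: the `simulate` function and the "buy"/"sell"/"hold" signal format are hypothetical, the strategy goes all-in or all-out, and fees and transaction delays are ignored.

```python
def simulate(prices, signals, start_cash):
    """Replay a strategy over a price series.

    signals[i] is "buy", "sell" or "hold", applied at prices[i].
    Fees and transaction delays are ignored (simplifying assumption).
    Returns the final portfolio value, marked at the last price.
    """
    cash, coins = start_cash, 0.0
    for price, signal in zip(prices, signals):
        if signal == "buy" and cash > 0:
            coins, cash = cash / price, 0.0   # convert all cash to coins
        elif signal == "sell" and coins > 0:
            cash, coins = coins * price, 0.0  # convert all coins to cash
    return cash + coins * prices[-1]


prices = [100.0, 80.0, 120.0, 90.0]
signals = ["hold", "buy", "sell", "hold"]

final = simulate(prices, signals, start_cash=1000.0)
gain = final - 1000.0
print(final, gain)
```

Running several signal lists through the same price series lets different trading strategies be compared on equal terms.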
Crawlers will be stored on and launched from the cluster provided by the course staff. The storage we will have there is more than enough for our project.
The use of sentiment analysis in this field is more recent, but it has already been studied for more than 10 years. It has shown positive results on conventional financial markets, so we expect it to work on cryptocurrencies as well. We found a complete file containing Bitcoin exchange data from August until now. The file's size is in the MB range.
Twitter4J is a powerful library; we use it in our Scala crawler, and each time a new tweet is created, we can see it. Of course, filters (author, hashtags, etc.) can be applied. As explained, we can find a huge database covering at least 1 year of tweets on Archive.

Participants and task assignment:
Web front-end (Graph)
Jonathan Cheseaux (team leader): Implementation of the time series predicting models, algorithm optimization, testing
Marzell Camenzind (time series resp.): Bitcoin transaction data fetching, models backtesting, quantitative analysis, architecture responsible
Fabien Schmitt: Implementation of the time series predicting models, algorithm optimization, testing
Igor Vokatch: Data visualization (Graph)

The authors are responsible for their content.