finance data source
The FinNLP GitHub repository shows a list of sources to read from.
transcript
https://discountingcashflows.com/api/transcript/AAPL/q1/2024/
But it seems to always return the latest transcript, whatever quarter and year are in the URL.
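One way to check this suspicion is to request two different periods and compare the bodies. The sketch below only assumes the URL shape from the example above; the helper names (`transcript_url`, `fetch_text`, `ignores_period`) are made up for illustration, and whether the endpoint needs authentication or special headers is unknown.

```python
import urllib.request

BASE_URL = "https://discountingcashflows.com/api/transcript"


def transcript_url(symbol: str, quarter: int, year: int) -> str:
    """Build a transcript URL in the same shape as the example URL."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"quarter must be 1-4, got {quarter}")
    return f"{BASE_URL}/{symbol.upper()}/q{quarter}/{year}/"


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Download a URL and return the raw body as text (no parsing assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def ignores_period(symbol: str, period_a: tuple, period_b: tuple,
                   fetch=fetch_text) -> bool:
    """Return True if two different (quarter, year) requests give the same body."""
    body_a = fetch(transcript_url(symbol, *period_a))
    body_b = fetch(transcript_url(symbol, *period_b))
    return body_a == body_b
```

Calling `ignores_period("AAPL", (1, 2024), (1, 2023))` would return `True` if the API really ignores the period. The `fetch` argument is injectable so the comparison can be tried without network access.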
fmp
Price targets and news are not available on the free account.
The cheapest plan is $19/month, billed yearly. Does news data count as "Market data"?
Conclusion: give up for now.
marketaux
Can get 3 news articles per request. There is an example in FinGPT.
Idea: get news from marketaux, then call ChatGPT to filter it.
But after randomly sampling the news, it is hard to find really useful info. Most articles are not really news; they just repeat similar content, such as "it's time to buy the stock".
Maybe I haven't found the right way to use the news.
source_ids: there are many sources that I don't recognize.
example:
import http.client
import json
import os
import urllib.parse


def get_news_from_market_aux(api_key: str, data_path: str = 'finance_news_from_market_aux.txt'):
    limit_line = 4

    # Reuse the cached response if it exists, so repeated runs don't spend API requests.
    if os.path.exists(data_path):
        with open(data_path, 'r') as f:
            data = json.load(f)
    else:
        conn = http.client.HTTPSConnection('api.marketaux.com')
        params = urllib.parse.urlencode({
            'api_token': api_key,
            "found": 8,
            "returned": 3,
            "limit": limit_line,
            "symbols": "SMCI",
            "min_match_score": 80,
            "page": 1,
            "source_id": "forbes.com-1",
            "domain": "forbes.com",
            "language": "en",
        })
        conn.request('GET', '/v1/news/all?{}'.format(params))
        data = conn.getresponse()
        data = data.read().decode('utf-8')
        data = json.loads(data)
        # Cache the parsed response on disk as pretty-printed JSON.
        with open(data_path, 'w') as f:
            f.write(json.dumps(data, indent=2))
    assert isinstance(data, dict)

    # Convert dict to string (Title: ... Content: ...)
    max_num_news = 8
    max_len_title = 32 * 4
    max_len_content = 0 * 4  # 0 means the description is dropped entirely
    data = data['data']
    data_str = ""
    for item in data[:max_num_news]:
        title = item['title'][:max_len_title]
        # The description may be missing or null, so fall back to an empty string.
        content = (item.get('description') or '')[:max_len_content]
        data_str += f"{title}, {content}\n"
    return data_str
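Since many articles just repeat each other, a cheap local pre-filter could drop near-duplicate headlines before anything is sent to ChatGPT. This is only a sketch of that idea, not something from FinGPT or marketaux: it uses the standard-library `difflib` similarity ratio, and the function names, the 0.8 threshold, and the prompt wording are all assumptions. The actual ChatGPT call is left out.

```python
from difflib import SequenceMatcher


def dedupe_titles(titles, threshold=0.8):
    """Keep a title only if it is not too similar to one already kept."""
    kept = []
    for title in titles:
        norm = title.lower().strip()
        if not norm:
            continue  # skip empty titles
        is_dup = any(
            SequenceMatcher(None, norm, k.lower().strip()).ratio() >= threshold
            for k in kept
        )
        if not is_dup:
            kept.append(title)
    return kept


def build_filter_prompt(symbol, titles):
    """Build a prompt asking the model to keep only headlines with real events."""
    lines = [f"{i}. {t}" for i, t in enumerate(titles, start=1)]
    return (
        f"Below are news headlines about {symbol}.\n"
        "Return only the numbers of headlines that report a concrete new event, "
        "not generic buy/sell advice.\n\n"
        + "\n".join(lines)
    )
```

For example, `build_filter_prompt("SMCI", dedupe_titles(titles))` gives a shorter prompt in which repeated "time to buy" headlines appear only once.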
polygon
Use polygon to download historical flat-file data.
But the cheapest subscription doesn't seem to include news? Give up.
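For the price data itself, a downloaded flat file can be read with the standard library alone. The sketch below assumes the file is a gzipped CSV with a `ticker` column; that column name and the function name are assumptions, and the download step is omitted because it depends on the subscription and credentials.

```python
import csv
import gzip
import io


def read_rows_for_ticker(gz_bytes: bytes, ticker: str) -> list:
    """Return the rows for one ticker from a gzipped CSV, as dicts of strings."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as f:
        # "ticker" is an assumed column name; adjust it to the real header.
        return [row for row in csv.DictReader(f) if row.get("ticker") == ticker]
```

With a file on disk this would be called as `read_rows_for_ticker(open(path, "rb").read(), "SMCI")`.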