finance data source

The FinNLP GitHub repository lists a number of data sources to read from.

transcript

https://discountingcashflows.com/api/transcript/AAPL/q1/2024/

But it seems to always return the latest transcript, whatever quarter and year are in the URL.
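
One way to check that observation is to request two different quarters and compare the raw bodies. The sketch below assumes the URL pattern generalizes from the single example above; the helper names are made up for illustration, and the response format is deliberately not parsed because it is not documented here.

```python
import urllib.request
from typing import Tuple

BASE_URL = "https://discountingcashflows.com/api/transcript"


def build_transcript_url(symbol: str, quarter: int, year: int) -> str:
    """Build a URL following the pattern of the example above (assumed to generalize)."""
    return f"{BASE_URL}/{symbol.upper()}/q{quarter}/{year}/"


def fetch_raw(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw response body without assuming anything about its format."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def same_response(symbol: str, first: Tuple[int, int], second: Tuple[int, int]) -> bool:
    """Return True if two (quarter, year) requests give byte-identical bodies."""
    body_a = fetch_raw(build_transcript_url(symbol, *first))
    body_b = fetch_raw(build_transcript_url(symbol, *second))
    return body_a == body_b


# Example (needs network access):
#   same_response("AAPL", (1, 2024), (1, 2023))
# A True result would support the "always the latest transcript" observation.
```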

fmp

fingpt fmp colab example

Price targets and news are not available on a free account.

The cheapest plan is $19/month, billed yearly. Does news data count as market data?
Conclusion: give up for now.

marketaux

documentation

It returns 3 news articles per request (see the example in FinGPT).

The idea: get news from Marketaux, then call ChatGPT to filter it.
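
A minimal sketch of that filter step, to show the shape of the idea rather than a finished pipeline. The prompt wording and the function names are my own, and `ask_llm` is a hypothetical hook: any function that takes a prompt string and returns the model's reply, so the sketch does not depend on a particular client library.

```python
from typing import Callable, List


def build_filter_prompt(titles: List[str]) -> str:
    """Number the headlines and ask for the numbers worth keeping."""
    numbered = "\n".join(f"{i}. {title}" for i, title in enumerate(titles, start=1))
    return (
        "Below are numbered finance headlines. Reply with the numbers of the "
        "headlines that report a concrete new event, separated by commas. "
        "Reply with 'none' if nothing qualifies.\n" + numbered
    )


def parse_kept_indices(reply: str, count: int) -> List[int]:
    """Extract valid 1-based indices from the reply, in order, without repeats."""
    kept: List[int] = []
    for token in reply.replace(",", " ").split():
        token = token.strip(".")
        if token.isdecimal():
            index = int(token)
            if 1 <= index <= count and index not in kept:
                kept.append(index)
    return kept


def filter_news(titles: List[str], ask_llm: Callable[[str], str]) -> List[str]:
    """Keep only the headlines the model selects (ask_llm is a hypothetical hook)."""
    if not titles:
        return []
    reply = ask_llm(build_filter_prompt(titles))
    return [titles[i - 1] for i in parse_kept_indices(reply, len(titles))]
```

Parsing the reply defensively matters here because the model may answer with extra words or out-of-range numbers; anything that is not a valid index is simply ignored.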

But after looking through a random sample of the news, it is hard to find really useful information. Most items are not really news; they just repeat similar points, such as "it's time to buy the stock".

Maybe I haven't found the right way to use the news.
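
One cheap thing to try before any LLM call is dropping headlines that are near-copies of one already seen. This is only a sketch: the function name and the 0.8 threshold are arbitrary choices of mine, and string similarity catches rewordings of the same headline, not articles that say the same thing in different words.

```python
from difflib import SequenceMatcher
from typing import List


def drop_similar_titles(titles: List[str], threshold: float = 0.8) -> List[str]:
    """Keep a title only if it is not too similar to any title already kept."""
    kept: List[str] = []
    for title in titles:
        normalized = title.lower().strip()
        is_repeat = any(
            SequenceMatcher(None, normalized, other.lower().strip()).ratio() >= threshold
            for other in kept
        )
        if not is_repeat:
            kept.append(title)
    return kept
```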

There are many source_ids, and I don't recognize most of them.

example:

import http.client
import json
import os
import urllib.parse


def get_news_from_market_aux(api_key: str, data_path: str = 'finance_news_from_market_aux.txt'):
    limit_line = 4

    if os.path.exists(data_path):
        # Reuse the cached response instead of spending another API request.
        with open(data_path, 'r') as f:
            data = json.load(f)
    else:
        conn = http.client.HTTPSConnection('api.marketaux.com')

        # Note: "found" and "returned" look like response metadata rather than
        # request filters; they are kept here as in the original example.
        params = urllib.parse.urlencode({
            'api_token': api_key,
            "found": 8,
            "returned": 3,
            "limit": limit_line,
            "symbols": "SMCI",
            "min_match_score": 80,
            "page": 1,
            "source_id": "forbes.com-1",
            "domain": "forbes.com",
            "language": "en",
        })

        conn.request('GET', '/v1/news/all?{}'.format(params))

        response = conn.getresponse()
        data = json.loads(response.read().decode('utf-8'))
        conn.close()

        # Cache the raw JSON response for later runs.
        with open(data_path, 'w') as f:
            f.write(json.dumps(data, indent=2))

    assert isinstance(data, dict)

    # Convert the dict to a string, one "title, content" line per article.
    max_num_news = 8
    max_len_title = 32 * 4
    max_len_content = 0 * 4  # 0 means the description is dropped entirely

    data = data['data']

    data_str = ""
    for item in data[:max_num_news]:
        # Fall back to '' so a missing or null field does not break slicing.
        title = (item.get('title') or '')[:max_len_title]
        content = (item.get('description') or '')[:max_len_content]
        data_str += f"{title}, {content}\n"
    return data_str
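
To get a feel for the unfamiliar sources, the cached response can be tallied by source. This is a sketch under assumptions: it reads the cache file written by the function above, and it assumes each article carries a `source` field; if that key is missing the article is counted as "unknown". The function names are mine.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple


def count_sources(response: Dict) -> List[Tuple[str, int]]:
    """Count how many articles each source contributed, most frequent first."""
    counts = Counter(
        item.get("source") or "unknown"  # the "source" key is an assumption
        for item in response.get("data", [])
    )
    return counts.most_common()


def count_sources_in_file(path: str = "finance_news_from_market_aux.txt") -> List[Tuple[str, int]]:
    """Load the cached JSON response and tally its sources."""
    with open(path, "r") as f:
        return count_sources(json.load(f))
```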

polygon

Polygon can be used to download historical flat-file data.
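
Once a flat file is on disk, reading it needs only the standard library. The sketch below assumes the file is a gzip-compressed CSV with a header row and a `ticker` column; the column name and the function name are assumptions, so check them against the file you actually download.

```python
import csv
import gzip
from typing import Dict, List


def rows_for_ticker(path: str, ticker: str, ticker_column: str = "ticker") -> List[Dict[str, str]]:
    """Read a gzip-compressed CSV and return the rows for one ticker."""
    with gzip.open(path, "rt", newline="") as f:
        reader = csv.DictReader(f)
        # Values stay as strings; convert prices and volumes as needed.
        return [row for row in reader if row.get(ticker_column) == ticker]
```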

But the cheapest subscription doesn't seem to include news, so I gave up.