- XAUUSD is the symbol used in Forex trading for the number of US dollars needed to buy one ounce of gold.
- Other symbols such as EURUSD express the exchange rate between two national currencies, while XAUUSD compares the price of the precious metal to the US dollar.
- It is possible to buy gold as a physical commodity from banks or dealers.
- Gold is often used by governments that hold large gold reserves to protect the value of their currency; that's why it is traded on the Forex market.
- I chose XAUUSD because of its high trading volume together with the extensive amount of historical and real-time data available.
- It is also not specific to any one nation, economy, or business; rather, it is globally essential.
- It is also intriguing to examine how the price of gold responds to an array of events, including inflation, interest rates, wars, and financial crises.
Objectives
- When I was implementing this project, what I had in mind was a simple Extraction, Transformation and Load (ETL) pipeline.
- I am using the TwelveData API to extract raw data from the financial markets in JSON format.
- After extraction comes the transformation of the raw data, which turns it into a clean dataset that can drive insight, analysis and alerts.
- Finally, loading stores the transformed dataset in a database for presentation or further processing.
Step by Step Build:
Get an API_KEY after setting up an account with TwelveData – don't worry, there is a free plan option.
.gitignore file:
.env
.env file:
API_KEY='xxxxxxxxxxxxxxxxxxxxxx'
- Add the API_KEY to your .env file, and remember to add the .env file to your .gitignore file.
- A .gitignore file specifies intentionally untracked files that Git should ignore.
- If you are using version control, this prevents the sensitive API_KEY from being pushed to GitHub.
extract.py:
Here, the raw data is extracted via the API.
import requests
import os
from dotenv import load_dotenv

load_dotenv()
GOLD_API_KEY = os.getenv('API_KEY')

def fetch_xauusd_data():
    url = f"https://api.twelvedata.com/time_series?symbol=XAU/USD&interval=5min&apikey={GOLD_API_KEY}"
    return requests.get(url).json()
import requests
- Requests is a Python HTTP library. It is commonly used for interacting with web services and APIs and for web scraping.
- In this context we are using it to send a GET request to retrieve the data.
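As a quick, standalone illustration of Requests (not part of the pipeline itself, and "demo" below is just a placeholder key):

import requests

response = requests.get(
    "https://api.twelvedata.com/time_series",
    params={"symbol": "XAU/USD", "interval": "5min", "apikey": "demo"},  # placeholder key
    timeout=10,
)
print(response.status_code)  # 200 means the HTTP request succeeded
print(response.json())       # parse the JSON body into a Python dict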
import os
- The os module provides functions for interacting with the operating system. Here it is used to read the environment variable that holds the API key.
from dotenv import load_dotenv
load_dotenv()
GOLD_API_KEY = os.getenv('API_KEY')
- This assigns the value of the environment variable "API_KEY" (loaded from the .env file by load_dotenv()) to the variable GOLD_API_KEY.
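An optional safeguard I would suggest (not in the script above) is to fail fast when the variable is missing:

import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the process environment
api_key = os.getenv('API_KEY')
if api_key is None:
    raise RuntimeError("API_KEY is not set - check your .env file")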
def fetch_xauusd_data():
    url = f"https://api.twelvedata.com/time_series?symbol=XAU/USD&interval=5min&apikey={GOLD_API_KEY}"
    return requests.get(url).json()
- This function connects to the TwelveData API, a service that provides financial data.
- It requests XAU/USD data at five-minute intervals.
- The request is authenticated with the API_KEY stored in the GOLD_API_KEY variable.
- The response comes back in JSON format, which .json() parses into a Python dictionary.
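To make the later steps easier to follow, the parsed response looks roughly like the dictionary below. The field names are my approximation of TwelveData's output, so treat this as illustrative rather than the documented schema:

# Illustrative shape only - field names and values are approximate
sample_response = {
    "meta": {"symbol": "XAU/USD", "interval": "5min"},
    "values": [
        {"datetime": "2024-01-02 15:05:00", "open": "2064.10",
         "high": "2065.30", "low": "2063.70", "close": "2064.90"},
    ],
    "status": "ok",
}
print(sample_response["values"][0]["close"])  # prices arrive as strings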
transform.py:
def transform_data(df):
    # Prices arrive as strings, so convert the close column to float
    df['close'] = df['close'].astype(float)
    # 10-period simple moving average of the close price
    df['SMA10'] = df['close'].rolling(window=10).mean()
    # Label each row Bullish when the close is above its SMA10, Bearish otherwise
    df['candle'] = df.apply(lambda row: 'Bullish' if row['close'] > row['SMA10'] else 'Bearish', axis=1)
    return df
df['close'] = df['close'].astype(float)
- The API sends the price data as strings.
- This line of code converts the 'close' column in the response to float.
- This makes it possible to perform calculations such as averages on the column.
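For example, as a standalone illustration outside transform.py:

import pandas as pd

df = pd.DataFrame({'close': ["2064.90", "2066.25"]})  # strings, as the API sends them
print(df['close'].astype(float).mean())  # 2065.575 - arithmetic now works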
df['SMA10'] = df['close'].rolling(window=10).mean()
- This line creates a new column called 'SMA10' (a 10-period simple moving average).
- It computes the average of the last 10 'close' prices for every row; the first 9 rows get NaN because a full window is not yet available.
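A tiny illustration of how rolling().mean() behaves, using a window of 3 to keep the example short:

import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
print(s.rolling(window=3).mean())
# 0    NaN
# 1    NaN
# 2    2.0
# 3    3.0
# 4    4.0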
df['candle'] = df.apply(lambda row: 'Bullish' if row['close'] > row['SMA10'] else 'Bearish', axis=1)
- This line adds a new column called 'candle' that labels the market condition based on a simple trend rule:
- If the closing price is above the 10-period moving average (SMA10) → label it 'Bullish'.
- If the closing price is below (or equal to) the SMA10 → label it 'Bearish'.
- Rows where SMA10 is still NaN (the first nine) also fall into the 'Bearish' branch, since a comparison with NaN is never true.
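The same labels could also be produced without apply using a vectorised expression; this is just an alternative sketch, transform.py keeps the apply version:

import numpy as np
import pandas as pd

df = pd.DataFrame({'close': [2064.9, 2066.2], 'SMA10': [2065.0, 2065.5]})
df['candle'] = np.where(df['close'] > df['SMA10'], 'Bullish', 'Bearish')
print(df)  # first row Bearish, second row Bullish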
load.py:
import sqlite3
def load_data(df, db_name="xauusd_data.db"):
    conn = sqlite3.connect(db_name)
    df.to_sql('xauusd_prices', conn, if_exists="append", index=False)
    conn.close()
import sqlite3
SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.
def load_data(df, db_name="xauusd_data.db"):
    conn = sqlite3.connect(db_name)
    df.to_sql('xauusd_prices', conn, if_exists="append", index=False)
    conn.close()
- The load_data function takes two inputs: the transformed DataFrame df and an optional name for the SQLite database file.
- After connecting to the SQLite file, the DataFrame is saved to the xauusd_prices table.
- if_exists="append" means new rows are added to the table instead of overwriting it.
- index=False means the DataFrame's index is not written as a column.
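To confirm that rows actually landed, they can be read back with pandas. This sketch assumes the default file and table names above, plus the column names produced by the API and the transform step:

import sqlite3
import pandas as pd

conn = sqlite3.connect("xauusd_data.db")
print(pd.read_sql_query(
    "SELECT datetime, close, SMA10, candle FROM xauusd_prices LIMIT 5", conn))
conn.close()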
Now to combine these scripts into one – *main.py*:
from extract import fetch_xauusd_data
from transform import transform_data
from load import load_data
import pandas as pd

def run_etl():
    print("Starting ETL pipeline...")

    print("Starting data extraction...")
    raw_data = fetch_xauusd_data()
    if 'values' not in raw_data:
        print("No data found, Check api configuration or usage!")
        return

    print("Starting data transformation...")
    df = pd.DataFrame(raw_data['values'])
    df = transform_data(df)

    print("Starting data loading...")
    load_data(df)

    print("ETL pipeline finished successfully!")

if __name__ == "__main__":
    run_etl()
- I combined all of this into a single main.py file, which coordinates the entire ETL pipeline by:
- Retrieving pricing information for gold (XAU/USD) via the API
- Transforming it with Pandas to compute the moving average and label candles
- Saving the output in a SQLite database