SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate
server process. Applications can use SQLite for internal data storage. It’s also possible to prototype
an application using SQLite and then later port the application to a production database system.
SQLite databases are stored in a file on disk (usually with a .db extension). If you attempt to connect
to a database file that doesn’t exist, SQLite with create a new database, assign it the name you passed
to the connect
function and save it to your current working directory.
Typical sqlite setup and usage in Python is as follows:
- Create a connection object
- Initialize a database cursor
- Construct a query for the target dataset
- Pass the query string to the cursor’s
execute
method - Iterate over the cursor’s result set
import sqlite3
db = sqlite3.connect("filename.db")
cursor = db.cursor()
SQL = "SELECT * FROM TABLE"
cursor.execute(SQL)
# Iterate over cursor and print queried records.
for record in cursor:
print(record)
The result will be a list of tuples, so data elements can be accessed by row or selectively by referencing components by index offset. The Pandas library simplifies much of the work interacting with sqlite databases, but for the purposes of this article we’ll work soley within the standard library.
Creating Tables
If the database file passed to the sqlite3.connect method doesn’t exist, a new database with the name specified will be created. The following example creates a database consisting of 2 tables: The first table holds closing stock prices, the second contains a mapping between ticker symbols and company names.
"""
Creating a new database with two tables using sqlite3.
=============================
Table 1 |
=============================
TABLENAME: `CLOSING_PRICES` |
|
FIELDS : DATE TEXT |
TICKER TEXT |
CLOSE REAL |
=============================
=============================
Table 2 |
=============================
TABLENAME: `TICKER_MAPPING` |
|
FIELDS : TICKER TEXT|
COMPANY NAME TEXT|
=============================
"""
import sqlite3
# Create new database sample.db. Notice sample.db is now
# listed in your working directory.
db = sqlite3.connect("sample.db")
# Initiate cursor object.
cursor = db.cursor()
# Specify the DDL to create the two tables.
tbl1_ddl = """CREATE TABLE CLOSING_PRICES (
DATE TEXT,
TICKER TEXT,
CLOSE REAL)"""
tbl2_ddl = """CREATE TABLE TICKER_MAPPING (
TICKER TEXT,
COMPANY_NAME TEXT)"""
# Call the cursor.execute method, passing tbl1_ddl and tbl2_ddl as arguments.
cursor.execute(tbl1_ddl)
cursor.execute(tbl2_ddl)
# IMPORTANT: Be sure to commit changes you want persisted. Without
# commiting, changes will not be saved.
db.commit()
# Close connection to sample.db.
db.close()
To verify that the tables have been created, run the following:
db = sqlite3.connect('sample.db')
cursor = db.cursor()
cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
print(cursor.fetchall())
db.close()
Loading Data in SQLite
The following example demonstrates two methods of loading data into SQLite tables. The first method assumes the data is already available within the current Python session. The second method assumes data is being loaded from a delimited data file. For the second example, refer to ticker_data.csv, available here.
"""
==========================================================
Method #1: Data already avialable in Python session |
==========================================================
Insert four records into CLOSING_PRICES table based on the closing
prices of AXP, GE, GS & UTX on 7.22.2016.
"""
# Reestablish connection to `sample.db` database.
db = sqlite3.connect("sample.db")
cursor = db.cursor()
# Single records can be inserted using the `cursor.execute` method.
cursor.execute("INSERT INTO TICKER_MAPPING VALUES ('AXP', 'American Express Company')")
cursor.execute("INSERT INTO TICKER_MAPPING VALUES ('GE' , 'General Electric Company')")
cursor.execute("INSERT INTO TICKER_MAPPING VALUES ('GS' , 'Goldman Sachs Group Inc')")
cursor.execute("INSERT INTO TICKER_MAPPING VALUES ('UTX' ,'United Technologies Corporation')")
# We can insert several records at once if we create a list of tuples of the
# data to insert, then call cursor.executemany.
closing_prices = [
('20160722', 'AXP', 64.28),
('20160722', 'GE' , 32.06),
('20160722', 'GS' , 160.41),
('20160722', 'UTX', 105.13)
]
cursor.executemany(
"INSERT INTO CLOSING_PRICES VALUES (?,?,?)", closing_prices
)
db.commit()
db.close()
The (?,?,?)
in cursor.executemany
serve as placeholders for columns in the target table. There should be
one ?
for each column in the target table.
A more common scenario may be loading data from delimited data file into an SQLite database table. The syntax is similiar, with added file handling.
"""
=====================================
Method #2: Data read in from .csv |
=====================================
Requires ticker_data.csv file.
"""
import sqlite3
import csv
db = sqlite3.connect("sample.db")
cursor = db.cursor()
# Open ticker_data.csv, and create a csv.reader instance. Then call
# executemany on the records read from file to load into the database.
with open('ticker_data.csv', 'r') as ff:
fcsv = csv.reader(ff)
# Read records from file into list.
recs_to_load = [record for record in fcsv]
# Load records into CLOSING_PRICES table.
cursor.executemany("INSERT INTO CLOSING_PRICES VALUES (?,?,?)", recs_to_load)
db.commit()
db.close()
Retrieving Data
To retrieve SQLite database records, an iterator in the form of a database cursor is returned, which is traversed to obtain to returned dataset elements.
import sqlite3
# Reestablish connection to sample.db database.
db = sqlite3.connect("sample.db")
cursor = db.cursor()
# Construct a query to retrieve data from CLOSING_PRICES.
SQL = "SELECT * FROM CLOSING_PRICES"
cursor.execute(SQL)
# cursor can now be iterated over.
for rec in cursor:
print(rec)
db.close()
Headers need to be extracted from the cursor.description
attribute:
db = sqlite3.connect("sample.db")
cursor = db.cursor()
# Construct query to retrieve data from CLOSING_PRICES table.
SQL = "SELECT * FROM CLOSING_PRICES"
cursor.execute(SQL)
# Capture table headers into `headers` list.
headers = [ii[0] for ii in cursor.description]
db.close()
Using Bind Variables
The following demonstrates the use of bind variables in SQLite for dynamic data retrieval:
db = sqlite3.connect("sample.db")
cursor = db.cursor()
# Bind variable key-value pairs.
params = {'symbol':'GE','date':'20161125'}
SQL = "SELECT * FROM CLOSING_PRICES WHERE TICKER=:symbol AND DATE!=:date"
cursor.execute(SQL, params)
# Read records into list iterating over cursor.
records = [record for record in cursor]
db.close()
Final Note
At times, it can be useful to interact with SQLite databases from a graphical interface, especially as the number of tables grows. One such tool is SQLiteStudio, a SQLite IDE that includes the tools necessary to manage databases, schemas, tables and related objects.