I can connect to my local MySQL database from Python, and I can create tables, select from them, and insert individual rows.
My question is: can I directly instruct MySQLdb to take an entire dataframe and insert it into an existing table, or do I need to iterate over the rows?
In either case, what would the Python script look like for a very simple table with an ID and two data columns, and a matching dataframe?
Answers:
Method 1
Update:
There is now a to_sql method, which is the preferred way to do this, rather than write_frame:
df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')
Also note: the syntax may change in pandas 0.14…
You can set up the connection with MySQLdb:
from pandas.io import sql
import MySQLdb
con = MySQLdb.connect() # may need to add some other options to connect
Setting the flavor of write_frame to 'mysql' means you can write to MySQL:
sql.write_frame(df, con=con, name='table_name_for_df',
if_exists='replace', flavor='mysql')
The argument if_exists tells pandas what to do if the table already exists:
if_exists: {'fail', 'replace', 'append'}, default 'fail'
- fail: If table exists, do nothing (current pandas versions of to_sql raise a ValueError instead).
- replace: If table exists, drop it, recreate it, and insert data.
- append: If table exists, insert data. Create if does not exist.
Although the write_frame docs currently suggest it only works on SQLite, MySQL appears to be supported, and in fact there is quite a bit of MySQL testing in the codebase.
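To make the three if_exists modes concrete, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the MySQL connection (so it runs without a server), and the table name demo_table and the column names are made up for illustration. The same to_sql calls should behave the same way against a MySQL connectable.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for the MySQL connection.
con = sqlite3.connect(":memory:")

# Hypothetical table layout: an ID and two data columns.
df = pd.DataFrame({"id": [1, 2], "data1": [0.5, 1.5], "data2": ["a", "b"]})

# 'replace' drops any existing table, recreates it, and inserts the rows.
df.to_sql("demo_table", con, if_exists="replace", index=False)

# 'append' inserts the rows into the table that now exists.
df.to_sql("demo_table", con, if_exists="append", index=False)

row_count = con.execute("SELECT COUNT(*) FROM demo_table").fetchone()[0]
print(row_count)

# 'fail' (the default) refuses to write because the table already exists;
# pandas signals this with a ValueError.
try:
    df.to_sql("demo_table", con, if_exists="fail", index=False)
    fail_raised = False
except ValueError:
    fail_raised = True
print(fail_raised)
```

Passing index=False keeps the DataFrame index out of the table, which is usually what you want when the frame already has its own ID column.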
Method 2
Andy Hayden mentioned the correct function (to_sql). In this answer, I'll give a complete example, which I tested with Python 3.5 but which should also work for Python 2.7 (and Python 3.x):
First, let’s create the dataframe:
# Create dataframe
import pandas as pd
import numpy as np

np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class': np.random.binomial(2, 0.1, size=number_of_samples),
}, columns=['feature1', 'feature2', 'class'])
print(frame)
Which gives:
   feature1  feature2  class
0  0.548814  0.791725      1
1  0.715189  0.528895      0
2  0.602763  0.568045      0
3  0.544883  0.925597      0
4  0.423655  0.071036      0
5  0.645894  0.087129      0
6  0.437587  0.020218      0
7  0.891773  0.832620      1
8  0.963663  0.778157      0
9  0.383442  0.870012      0
To import this dataframe into a MySQL table:
# Import dataframe into MySQL
import sqlalchemy

database_username = 'ENTER USERNAME'
database_password = 'ENTER USERNAME PASSWORD'
database_ip = 'ENTER DATABASE IP'
database_name = 'ENTER DATABASE NAME'
database_connection = sqlalchemy.create_engine(
    'mysql+mysqlconnector://{0}:{1}@{2}/{3}'.format(
        database_username, database_password, database_ip, database_name))
frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')
One catch is that the original MySQLdb package (MySQL-python) doesn't work with Python 3.x. So instead we use mysqlconnector, which may be installed as follows:
pip install mysql-connector==2.1.4 # version avoids Protobuf error
Note that to_sql creates the table as well as the columns if they do not already exist in the database.
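One thing to watch when building the connection string with format: a password containing characters such as @ or / will break the URL unless it is percent-encoded. The sketch below shows one way to handle that with the standard library. The helper name build_mysql_url and the credentials are hypothetical, chosen only for illustration.

```python
from urllib.parse import quote_plus


def build_mysql_url(username, password, host, database, driver="mysqlconnector"):
    """Build a SQLAlchemy-style MySQL URL, percent-encoding the credentials.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return "mysql+{driver}://{user}:{pw}@{host}/{db}".format(
        driver=driver,
        user=quote_plus(username),
        pw=quote_plus(password),
        host=host,
        db=database,
    )


# Made-up credentials; note the '@' and '/' inside the password.
url = build_mysql_url("app_user", "p@ss/word", "127.0.0.1", "test_db")
print(url)
```

The resulting string can then be passed to sqlalchemy.create_engine in place of the hand-formatted URL.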
Method 3
You can do it by using pymysql:
For example, suppose you have a MySQL server with the following user, password, host and port, and you want to write to the database 'data_2', whether or not it already exists.
import pymysql

user = 'root'
passw = 'my-secret-pw-for-mysql-12ud'
host = '172.17.0.2'
port = 3306
database = 'data_2'
If you already have the database created:
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw,
                       db=database, charset='utf8')
data.to_sql(name=database, con=conn, if_exists='replace', index=False, flavor='mysql')
If you do NOT have the database created, also valid when the database is already there:
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw)
conn.cursor().execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw,
                       db=database, charset='utf8')
data.to_sql(name=database, con=conn, if_exists='replace', index=False, flavor='mysql')
Method 4
The to_sql method works for me.
However, keep in mind that using it with a DBAPI connection looks like it is going to be deprecated in favor of SQLAlchemy:
FutureWarning: The 'mysql' flavor with DBAPI connection is deprecated and will be removed in future versions. MySQL will be further supported with SQLAlchemy connectables. chunksize=chunksize, dtype=dtype)
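The warning concerns handing a raw MySQL DBAPI connection to pandas. If a raw connection is all you have, the rows can still be sent in one batch with the cursor's executemany, which also answers the "do I need to iterate over the rows" part of the question. This is only a sketch: it assumes an in-memory SQLite database as a stand-in so that it runs without a server, and the table name simple_table and its columns are made up. With MySQLdb or pymysql the placeholders would be %s rather than ?.

```python
import sqlite3

import pandas as pd

# Hypothetical frame matching a simple table: an ID and two data columns.
df = pd.DataFrame({"id": [1, 2, 3], "data1": [0.1, 0.2, 0.3], "data2": [10, 20, 30]})

# Assumption: SQLite stands in for MySQL; the table already exists.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE simple_table (id INTEGER PRIMARY KEY, data1 REAL, data2 INTEGER)"
)

# Turn the frame into plain tuples, one per row, in column order.
rows = list(df.itertuples(index=False, name=None))

# One call inserts every row; MySQL drivers use %s instead of ? here.
con.executemany(
    "INSERT INTO simple_table (id, data1, data2) VALUES (?, ?, ?)", rows
)
con.commit()

stored = con.execute(
    "SELECT id, data1, data2 FROM simple_table ORDER BY id"
).fetchall()
print(stored)
```

This keeps the existing table definition untouched, which to_sql with if_exists='replace' would not.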
Method 5
Python 2 + 3
Prerequisites
- Pandas
- MySQL server
- sqlalchemy
- pymysql: pure python mysql client
Code
from pandas.io import sql
from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="root", pw="your_password", db="pandas"))
df.to_sql(con=engine, name='table_name', if_exists='replace')
Method 6
You might output your DataFrame as a CSV file and then use mysqlimport to import the CSV into your MySQL database.
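A rough sketch of the CSV route follows. It assumes a made-up table called simple_table and writes to a temporary directory. mysqlimport derives the table name from the file name, so the file is named after the table. The shell command in the trailing comment uses placeholder values and is not executed by the script.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame: an ID and two data columns.
df = pd.DataFrame({"id": [1, 2], "data1": [0.5, 1.5], "data2": [7, 8]})

# mysqlimport uses the file name (minus extension) as the table name,
# so name the file after the target table (assumed here: simple_table).
table_name = "simple_table"
csv_path = os.path.join(tempfile.mkdtemp(), table_name + ".csv")

# index=False keeps the DataFrame index out of the file.
df.to_csv(csv_path, index=False)

with open(csv_path) as handle:
    csv_lines = handle.read().splitlines()
print(csv_lines)

# Then, from a shell (USER and DATABASE_NAME are placeholders):
#   mysqlimport --local --ignore-lines=1 --fields-terminated-by=, \
#       --user=USER --password DATABASE_NAME /path/to/simple_table.csv
```

The --ignore-lines=1 option skips the header row that to_csv writes by default.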
EDIT
It seems pandas's built-in SQL utility provides a write_frame function, but it only works with SQLite.
I found something useful, you might try this
Method 7
This has worked for me. At first I created only the database, with no predefined table.
from platform import python_version
print(python_version())
3.7.3
path = 'glass.data'
df = pd.read_csv(path)
df.head()
!conda install sqlalchemy
!conda install pymysql
pd.__version__
'0.24.2'
sqlalchemy.__version__
'1.3.20'
I restarted the kernel after installation.
from sqlalchemy import create_engine

engine = create_engine('mysql+pymysql://USER:PASSWORD@HOST:PORT/DATABASE_NAME', echo=False)
try:
    df.to_sql(name='glasstable', con=engine, index=False, if_exists='replace')
    print('Successfully written to Database!!!')
except Exception as e:
    print(e)
Method 8
This should do the trick:
import pandas as pd
import pymysql
pymysql.install_as_MySQLdb()
from sqlalchemy import create_engine

# Create engine
engine = create_engine('mysql://USER_NAME_HERE:PASS_HERE@HOST_ADDRESS_HERE/DB_NAME_HERE')

# Create the connection and close it (whether it succeeded or failed)
with engine.begin() as connection:
    df.to_sql(name='INSERT_TABLE_NAME_HERE/INSERT_NEW_TABLE_NAME', con=connection,
              if_exists='append', index=False)
Method 9
df.to_sql(name='owner', con=db_connection, schema='aws', if_exists='replace', index=True, index_label='id')
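To illustrate what index=True together with index_label does, here is a small sketch. It assumes an in-memory SQLite database as a stand-in for db_connection, leaves out the schema argument (which is specific to the MySQL setup above), and uses made-up data. The DataFrame index ends up as a column named id in the table.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for db_connection.
con = sqlite3.connect(":memory:")

# Made-up data; the default RangeIndex (0, 1) becomes the 'id' column.
df = pd.DataFrame({"name": ["ann", "bob"]})
df.to_sql(name="owner", con=con, if_exists="replace", index=True, index_label="id")

# Inspect the column names and the stored rows.
columns = [row[1] for row in con.execute("PRAGMA table_info(owner)")]
rows = con.execute("SELECT id, name FROM owner ORDER BY id").fetchall()
print(columns)
print(rows)
```

If the frame already carries its own ID column, index=False avoids writing a second one.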
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.