Using bulk_update_mappings in SQLAlchemy to update multiple rows with different values

I have two tables Foo and Bar. I just added a new column x to the Bar table which has to be populated using values in Foo

class Foo(Base):
    __table__ = 'foo'
    id = Column(Integer, primary_key=True)
    x = Column(Integer, nullable=False)

class Bar(Base):
    __table__ = 'bar'
    id = Column(Integer, primary_key=True)
    x = Column(Integer, nullable=False)
    foo_id = Column(Integer, ForeignKey('foo.id'), nullable=False)

One straightforward way to do it would be iterating over all the rows in Bar and then updating them one-by-one, but it takes a long time (there are more than 100k rows in Foo and Bar)

for b, foo_x in session.query(Bar, Foo.x).join(Foo, Foo.id==Bar.foo_id):
    b.x = foo_x
session.flush()

Now I was wondering if this would be right way to do it –

mappings = []
for b, foo_x in session.query(Bar, Foo.x).join(Foo, Foo.id==Bar.foo_id):
    info = {'id':b.id, 'x': foo_x}
    mappings.append(info)
session.bulk_update_mappings(Bar, mappings)

There are not much examples on bulk_update_mappings out there. The docs suggest

All those keys which are present and are not part of the primary key
are applied to the SET clause of the UPDATE statement; the primary key
values, which are required, are applied to the WHERE clause.

So, in this case id will be used in the WHERE clause and then that would be updates using the x value in the dictionary right ?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

The approach is correct in terms of usage. The only thing I would change is something like below

mappings = []
i = 0

for b, foo_x in session.query(Bar, Foo.x).join(Foo, Foo.id==Bar.foo_id):
    info = {'id':b.id, 'x': foo_x}
    mappings.append(info)
    i = i + 1
    if i % 10000 == 0:
        session.bulk_update_mappings(Bar, mappings)
        session.flush()
        session.commit()
        mappings[:] = []

session.bulk_update_mappings(Bar, mappings)

This will make sure you don’t have too much data hanging in memory and you don’t do a too big insert to the DB at a single time


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x