Incorrect number of rows returned by JOIN query in Python script

Klausos Klausos

I wrote the following code in Python that connects to DB, creates two tables and joins them. Then it prints the result of the JOIN query.

The problem is that the number of rows is 3, though I expected to get 2 rows. Also, if I run the same query in a command prompt using sqlite>, then the number of rows returned by the JOIN is correct, i.e. 2.

import sqlite3 as lite
import pandas as pd

# Connecting to the database. The `connect()` method returns a connection object.
con = lite.connect('getting_started.db')

with con:
    cur = con.cursor()
    cur.execute("DROP TABLE IF EXISTS cities")
    cur.execute("DROP TABLE IF EXISTS weather")

    cur.execute("CREATE TABLE cities (name text, state text)")
    cur.execute("CREATE TABLE weather (city text, year integer, warm_month text, cold_month text, average_high integer)")

    # Filling 'cities' with the data
    cur.execute("INSERT INTO cities VALUES('Washington', 'DC')")
    cur.execute("INSERT INTO cities VALUES('Houston', 'TX')")

    # Filling 'weather' with the data
    cur.execute("INSERT INTO weather VALUES('Washington', 2013, 'July', 'January', 59)")
    cur.execute("INSERT INTO weather VALUES('Houston', 2013, 'July', 'January', 62)")

    # Joining data together
    sql = "SELECT name, state, year, warm_month, cold_month FROM cities " \
          "INNER JOIN weather " \
          "ON name = city"
    cur.execute(sql)

rows = cur.fetchall()
cols = [desc[0] for desc in cur.description]

# Loading data into pandas
df = pd.DataFrame(rows, columns=cols)

for index, row in df.iterrows():
    print("City: {0}, The warmest month: {1}".format(row['name'],row['warm_month']))

In Python the result is:

City: Washington, The warmest month: July
City: Washington, The warmest month: July
City: Houston, The warmest month: July

However, in command prompt the result is different (correct):

City: Washington, The warmest month: July
City: Houston, The warmest month: July
Tom Dalton

The issue is that your rows = cur.fetchall() is outside the con connection context manager, so something odd is happening when you use the cursor and it's DB connection has been closed.

Ref the docs here: https://docs.python.org/2/library/sqlite3.html#using-the-connection-as-a-context-manager it suggests that the with con: provides a transaction, which might explain the strange behaviour executing the statement in the transaction but then trying to use a cursor on the result set outside of the transaction.

It still seems odd to me and I would have expected that this use would have caused sqllite3 to raise an exception telling you this was happening.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

Left join reducing the number of rows returned

SQL How to JOIN a child table without increasing the number of rows returned in the query?

get number of rows in join query

Unable to count the number of rows returned from query

Incorrect day number returned

JPA : Restrict query result number | Number of rows returned

Limit the number of rows returned from a query in both oracle and postgres

Get the total number of rows returned of a query using mysql

Does speed of select query depends on number of rows and columns returned?

Inner Join Query incorrect result

Laravel join on returned query with get()

Incorrect number of html elements being returned

SQL limit number of returned rows for each specified column value, in single query

How do I limit the number of rows returned by an Oracle query after ordering?

Natural join vs inner join giving different number of rows for a simple query in sqlite3?

Why does my left join query produce more rows than the number of rows in the left table?

postgresql RIGHT Join: limit returned rows

Limiting rows returned in a left join in MySQL issue

count the number of rows in query

MySQL join query for 2 tables: incorrect return

Incorrect number of rows per page in ngTable

SQL: Error showing incorrect number of rows

Rename returned values from SQL Join Query

GROUP BY the number of rows returned by GROUP BY in MySQL

Get Number of Rows returned by ResultSet in Java

Iterate the number of rows returned from a select statement

Count number of returned rows in OOP PHP

Is there any limit to the number of rows returned by API?

Count the number of rows returned in SQL ORACLE