Reduce record count in MySQL temp sensor table
New here? Learn about Bountify and follow @bountify to get notified of new bounties! x

A MySQL table (TempLog) can contain 0 to many temperature recordings per hour, per station.
If more than one record exists, we need only 1 record per station per hour, preferably the highest temperature.

Need a solution in SQL or Python3 (using Pymysql) that deletes records, leaving only one record per hour in the table.

The TempLog table structure is:

-ID int(11),

-Station varchar(255),

-logDate datetime,
-temp int,

-humidity int.

Sample data:
ID Station logDate temp humidity

3 102upstairs 2018-01-04 20:20:02 22 30

4 102upstairs 2018-01-04 20:20:02 22 30

5 102upstairs 2018-01-04 20:40:02 22 30

6 102basement 2018-01-04 20:50:02 17 44

7 102upstairs 2018-01-04 21:40:04 22 31

You need a SQL that gets an hourly max temp for each station, am i right?
enderdba 3 months ago
Tags
MySQL
python3

Crowdsource coding tasks.

4 Solutions


The following will give you one measurement per station/hour (not necessarily the one with the highest temp):

SELECT * FROM TempLog GROUP BY Station, DATE_FORMAT(logDate, '%Y%m%d%H');

To select the one with the highest temperature within a station/hour:

SELECT *, MAX(temp) AS max FROM TempLog GROUP BY Station, DATE_FORMAT(logDate, '%Y%m%d%H');

To delete all but one record per hour/station:

DELETE TempLog WHERE ID NOT IN (
    SELECT ID FROM TempLog GROUP BY Station, DATE_FORMAT(logDate, '%Y%m%d%H');
);

Similarly, but chosing to maintain the records with the highest temperature per hour/station:

DELETE FROM TempLog WHERE ID NOT IN (
  SELECT ID FROM (
    SELECT ID, MAX(temp) from TempLog GROUP BY Station, DATE_FORMAT(logDate, '%Y%m%d%H')
  )
);
Thanks, I missed explaining that I want to delete all but one per hour.
broadreach 3 months ago

This SQL should do it

SELECT Station, DATE_FORMAT(logDate, "%Y-%m-%d %H") "Date", MAX(temp), humidity
FROM TempLog
GROUP BY Station, DATE_FORMAT(logDate, "%Y-%m-%d %H"), humidity;

It uses the MAX correlated query together with the GROUP BY to get the maximum temperature per hour.

To delete all but those selected use:

DELETE FROM TempLog
WHERE id NOT IN (
  SELECT id
  FROM TempLog
  WHERE (Station, temp, humidity) IN (
    SELECT Station, MAX(temp), humidity
    FROM TempLog
    GROUP BY Station, DATE_FORMAT(logDate, "%Y-%m-%d %H"), humidity
  )
);
On the delete statement I get the following SQL Error: Error : You can't specify target table 'TempLog' for update in FROM clause Thanks for looking at this!
broadreach 3 months ago

This will select only 1 record per station per hour, with the highest temperature only

For Select

SELECT a.ID, a.Station, a.logDate, max(a.temp), a.humidity
FROM templog a
GROUP by a.Station,hour(logDate)

For Delete

Delete from templog where id NOT IN
( SELECT ID from (
SELECT a.ID, max(a.temp) FROM templog a GROUP by a.Station,hour(a.logDate)
) as temp
);

O/p: https://ibb.co/mGTbAJ
SilverHood Apps 3 months ago
For delete op: https://ibb.co/jQfEVJ
Works perfectly fine..gives same o/p as above https://ibb.co/noN04d
SilverHood Apps 3 months ago
@broadreach Thank you. :)
SilverHood Apps 3 months ago
Actually, just grouping by hour will be buggy if you have multiple days with the same hour period.
alixaxel 3 months ago
Hi @alixaxel thanks, thought about it a lot however the requirement stated as follows: we need only 1 record per station per hour Im assumed the sensory system is a short lived one i.e within a day.. For more than a day @broadreach if you need,then add day(a.logDate) just before hour(a.logDate) and its good to go.. Im here if more assistance is required :)
SilverHood Apps 3 months ago

python 2.7

import pymysql

db = pymysql.connect("localhost", "root", "", "TESTDB")
cursor = db.cursor()
ID = []


try:
    sql_select = 'SELECT Station, logDate, MAX(temp),humidity FROM TempLog GROUP BY Station, DATE_FORMAT(logDate, "%Y-%m-%d %H")'
    cursor.execute(sql_select)
    results = cursor.fetchall()
    results = list(results)
    if results:
        try:
            cursor.execute("TRUNCATE TABLE  TempLog")
            cursor.executemany("INSERT INTO TempLog (Station, logDate, temp, humidity) VALUES (%s, %s, %s, %s)", results)
            db.commit()
        except:
            print "An Error Occured"

except:
    print ("Error: unable to fetch data")
    db.rollback()

db.close()
Thanks, we may use this too.
broadreach 3 months ago
Yup, No problem.
Codeword 3 months ago
I left a $10 tip for this, did you get it? it didn't show in my account info.
broadreach 3 months ago
Thanks but it's not showing in my account too ;)
Codeword 3 months ago
got it, thanks broadreach.
Codeword 3 months ago
View Timeline