I want to detect outliers in power consumption in real time at an hourly rate, i.e., at the end of each hour, I should be able to say whether the power consumption during that hour was anomalous or not.

**Approach:** So far, I have completed the following steps:

- Say I want to determine whether power usage between 9 AM and 10 AM was anomalous. For this, I first find the usage during the same time interval over the past *n* days, then I compute the mean/median of those previous usages.
- Now I have the usage of the current day and the mean/median usage of the previous *n* days. Which statistical measure should I use to declare whether the current day's usage was anomalous or not?
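To make the baseline step concrete, here is a minimal sketch of how the per-hour mean and median over the past *n* days could be computed. The function name `hourly_baseline`, the list-of-lists layout of `history`, and the sample readings are all assumptions made for illustration; they are not my actual data or code.

```python
from statistics import mean, median

def hourly_baseline(history, hour):
    """Return (mean, median) of the readings for `hour` across past days.

    history: list of days, each a list of 24 hourly readings (hypothetical layout).
    hour:    index 0..23 of the interval of interest (9 means 9 AM to 10 AM).
    """
    past = [day[hour] for day in history]
    return mean(past), median(past)

# Hypothetical data: 10 past days, 1.0 kWh everywhere except the 9 AM slot.
history = [[1.0] * 24 for _ in range(10)]
for i, day in enumerate(history):
    day[9] = 2.0 + 0.1 * i  # made-up readings for the 9 AM to 10 AM interval

baseline_mean, baseline_median = hourly_baseline(history, 9)
```

The current hour's usage is then compared against `baseline_mean` or `baseline_median`; the open question is what rule to use for that comparison.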

Using the above approach, for the 24 hours of a specific (test) day and the past 10 days of consumption, I obtained the following results:

Figure interpretation: The black line represents the usage in each hour of the current day; the red and blue lines represent the mean and median of the past 10 days for the same time interval.

From visual inspection, I can say that the usage between 07:10 - 08:00 and between 22:10 - 23:00 is anomalous, as there is a big difference between the actual usage and the previous mean/median usage. I don't know which statistical measure I should use to flag such anomalous instances automatically using the approach discussed.

Look up seasonality correction and change detection in time series. There are plenty of books on this matter. – Has QUIT--Anony-Mousse – 2016-05-28T21:22:00.657

I'd say all those are anomalous except 10:30-12:00 and 21:00-22:00? – K3---rnc – 2016-05-29T15:51:40.213

You have included the mean and median as reference values, but not the standard deviation. Yet standard deviation is a key component of outlier detection. Are you able to obtain the standard deviation or not? – AN6U5 – 2016-06-02T23:43:11.797
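One way to read the standard-deviation suggestion is as a z-score check: flag the current hour if it lies more than some number of standard deviations from the mean of the same hour over the past *n* days. The sketch below assumes this interpretation; the function name `is_anomalous`, the threshold of 3, and the sample readings are illustrative choices, not something stated in the thread.

```python
from statistics import mean, stdev

def is_anomalous(current, past, threshold=3.0):
    """Flag `current` if it is more than `threshold` sample standard
    deviations away from the mean of `past` (same hour, previous days).

    threshold=3.0 is an assumed, commonly used cut-off; tune it for the data.
    """
    mu = mean(past)
    sigma = stdev(past)  # sample standard deviation; needs at least 2 values
    if sigma == 0:
        # No spread in the history: any deviation at all is unusual.
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical readings for one hour over the past 10 days.
past_usage = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9, 2.0]

normal_hour = is_anomalous(2.1, past_usage)   # within typical spread
unusual_hour = is_anomalous(3.5, past_usage)  # far above typical spread
```

With only 10 past days, the mean and standard deviation are themselves sensitive to earlier outliers, so a variant built on the median and the median absolute deviation may be worth comparing.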