Fuzzy Time Series Using the pyFTS Library

Overview

A friend asks you how the weather is. Your reply depends on the season, right? Hot or cold. But will it always be either hot or cold? No. Sometimes it will be “somewhat hot”; sometimes it will be “too cold”. Our mind does not see only two extremes. There is always some middle ground.

That’s exactly how fuzzy logic works. Each point in our universe of discourse can belong to several categories, called ‘fuzzy sets’, at the same time. How strongly a point belongs to each fuzzy set is given by its membership grade, a value between 0 and 1 computed by that set’s membership function. The triangular membership function is a common choice.
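To make the membership grade concrete, here is a minimal sketch of a triangular membership function in plain Python. The function name, the two temperature sets, and their boundaries are hypothetical and chosen only for illustration; they are not taken from pyFTS.

```python
def triangular(x, a, b, c):
    """Membership grade of x in a triangular fuzzy set.

    a and c are the feet of the triangle (grade 0), b is the peak (grade 1).
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


# Two overlapping, hypothetical sets over temperature in degrees Celsius
cold = (0.0, 10.0, 20.0)
warm = (10.0, 20.0, 30.0)

temperature = 12.0
grade_cold = triangular(temperature, *cold)
grade_warm = triangular(temperature, *warm)
print(grade_cold, grade_warm)
```

A temperature of 12 degrees belongs mostly to "cold" and partly to "warm" at the same time, which is the middle ground described above.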

Recently, there has been a lot of progress in the field of fuzzy time series forecasting, including new libraries such as pyFTS (developed at the Federal University of Minas Gerais, Brazil, together with the Federal Institute of North of Minas Gerais and the Federal Institute of Minas Gerais). But why use fuzzy logic in time series forecasting?

The basic concept of fuzzy time series is to divide the universe of discourse into multiple intervals and then learn rules describing how these intervals follow one another over time. Let’s use the example of time series forecasting for Alabama University enrolment data to understand this in more detail. A link to the code can be found at the end of the article.

Methodology

Defining the universe of discourse – We first define our universe of discourse: the range between the lowest and highest values of the Y variable, usually widened by a margin (20% is a typical setting). In this case, the bounds are derived from the minimum and maximum enrolments across the years.
Partitioning – After defining the universe, we divide it into multiple overlapping intervals. These become our fuzzy sets. There are several methods of partitioning; the easiest one is grid partitioning, in which all intervals have the same length.
Fuzzification – This step maps each numerical value to the fuzzy sets that were defined in the previous step. Each y value can belong to multiple fuzzy sets, but for simplicity we can assign it the set with the maximum membership grade. For example, if Y1 has a membership grade of 0.8 in set A0 and 0.2 in set A1, we assign it A0.
Creating patterns – We follow the Precedent -> Consequent format. Once all Y values are fuzzified, they form a series, e.g. A0 – A1 – A1 – A2 – A3 – A4 – A4 – A5, etc. Each consecutive pair, the set at f(t) followed by the set at f(t+1), gives one pattern: A0 -> A1, A1 -> A1, A1 -> A2, and so on.
Creating rules – Based on the patterns from the previous step, the model creates rules by grouping all consequents that share the same precedent. For example:
A0 -> A1
A1 -> A1, A2
A2 -> A3
A3 -> A4
A4 -> A4, A5
These rules form the backbone of our model. They will be used to forecast unseen data.
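The training steps above can be sketched in a few lines of plain Python. This is an illustrative sketch, not pyFTS code: the series is hypothetical (not the real enrolment figures), the 20% margin is assumed to be a fraction of the data range, and the partition is assumed to consist of equal-width triangles with evenly spaced peaks. The values were chosen so that the result reproduces the example rule table.

```python
from collections import defaultdict

# Hypothetical series, for illustration only (not the real enrolment figures)
series = [100, 118, 124, 142, 158, 176, 184, 200]

# 1. Universe of discourse: data range widened by a 20% margin on each side
#    (assumption: the margin is a fraction of the data range)
lo, hi = min(series), max(series)
margin = 0.2 * (hi - lo)
u_min, u_max = lo - margin, hi + margin

# 2. Grid partitioning: n_sets overlapping triangles with equally spaced peaks
n_sets = 6
step = (u_max - u_min) / (n_sets + 1)
centers = [u_min + step * (i + 1) for i in range(n_sets)]


def membership(x, center):
    """Triangular membership with peak at `center` and half-width `step`."""
    return max(0.0, 1.0 - abs(x - center) / step)


# 3. Fuzzification: keep only the set with the maximum membership grade
def fuzzify(x):
    return max(range(n_sets), key=lambda i: membership(x, centers[i]))


fuzzified = [fuzzify(y) for y in series]

# 4. Patterns: each consecutive pair is one precedent -> consequent pattern
# 5. Rules: group all consequents that share the same precedent
rules = defaultdict(set)
for precedent, consequent in zip(fuzzified, fuzzified[1:]):
    rules[precedent].add(consequent)

for precedent in sorted(rules):
    consequents = ", ".join(f"A{c}" for c in sorted(rules[precedent]))
    print(f"A{precedent} -> {consequents}")
```

Storing the consequents in a set means a repeated pattern counts only once; a weighted variant would keep the counts instead.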

After training the model, it’s time to make predictions. The steps below will be followed:

Input value fuzzification – As discussed earlier, the numeric y value is assigned to a fuzzy set. We have a y value for T=t and we want to predict for T=t+1. For example, f(t) belongs to A4.
Finding compatible rules – Next we select the rule whose precedent matches that fuzzy set, e.g. since f(t) = A4, we use the rule A4 -> A4, A5.
Defuzzification – Since we need a numerical value in the final output, we convert the consequent sets back to a number. The forecast is the mean of the centers of the consequent fuzzy sets, here A4 and A5.
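The three forecasting steps can be sketched as follows. Again this is plain Python rather than the pyFTS API; the set centers are hypothetical, the rules mirror the example rule table, and the fallback for a set that has no rule is an assumption made for this sketch.

```python
# Hypothetical trained model: peaks of six triangular sets A0..A5
# and the rules from the example rule table (precedent -> consequents)
centers = [100.0, 120.0, 140.0, 160.0, 180.0, 200.0]
rules = {0: [1], 1: [1, 2], 2: [3], 3: [4], 4: [4, 5]}


def fuzzify(x):
    """Index of the set with the nearest peak.

    For equal-width triangles this is the set with the maximum membership.
    """
    return min(range(len(centers)), key=lambda i: abs(x - centers[i]))


def forecast(x):
    """One-step-ahead forecast for the value observed at time t."""
    precedent = fuzzify(x)                       # input value fuzzification
    # Assumption: if no rule has this precedent, fall back to the set itself
    consequents = rules.get(precedent, [precedent])  # compatible rule
    # Defuzzification: mean of the centers of the consequent sets
    return sum(centers[i] for i in consequents) / len(consequents)


prediction = forecast(178.0)  # 178 -> A4; A4 -> A4, A5; mean of 180 and 200
print(prediction)             # 190.0
```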

This is where fuzzy time series pay off: the rules are human-readable, which makes the model easy to interpret. For example, a rule could tell an investor that if the “moving average” is low and the “Relative Strength Index” is low, they should sell!

Illustration

Problem Statement: Time series forecasting of Alabama University enrolment data using the pyFTS library. More about the library at https://pyfts.github.io/pyFTS/build/html/index.html.

Data – We will use Alabama University’s enrolment data over the past several years.

Alabama University Enrolment Data

Code (Google Colab, taken from external source): https://colab.research.google.com/drive/1S1QSZfO3YPVr022nwqJC5bEJvrXbqS_A

References

Petrônio Silva, Ph.D. A Short Tutorial on Fuzzy Time Series
https://towardsdatascience.com/a-short-tutorial-on-fuzzy-time-series-dcc6d4eb1b15
Chen Shyi-Ming. Forecasting enrollments based on fuzzy time series
http://www.softcomputing.net/nf_chapter.pdf

-Authored by Rhydham Gupta, Data Scientist at Absolutdata

Technical articles are published from the Absolutdata Labs group, and hail from The Absolutdata Data Science Center of Excellence. These articles also appear in BrainWave, Absolutdata’s quarterly data science digest.
