Research Areas:

Big Data

‘Big data refers to data that would typically be too expensive to store, manage, and analyze using traditional (relational and/or monolithic) database systems. Usually, such systems are cost-inefficient because of their inflexibility for storing unstructured data (such as images, text, and video), accommodating “high-velocity” (real-time) data, or scaling to support very large (petabyte-scale) data volumes.’

‘Data analytics only returns more value when you have access to more data, so organizations across multiple industries have found big data to be a rich resource for uncovering profound business insights. And, because machine-learning models get more efficient as they are “trained” with more data, machine learning and big data are highly complementary.’ https://cloud.google.com/what-is-big-data

Four challenges have to be considered:

    Volume (the sheer amount of data; the year 2025 will feature eight times more data than in 2017[2])
    Velocity (the speed with which data is generated and processed, e.g. streaming, IoT, social media)
    Variety (structured and, increasingly, unstructured data)
    Veracity (poor data quality and a lack of the know-how needed to evaluate it)

Typical characteristics of Big Data (storage) technologies are:

    Distributed Storage
    Data Replication
    Local Data Processing
    High Availability
    Data Partitioning
    Denormalized Data
    Working with structured and unstructured data
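
To make the partitioning and replication items above more concrete, here is a minimal sketch in Python. It is not taken from any of the quoted sources: the function names, the hash-modulo partitioning scheme, and the "next N nodes" replica placement are all assumptions chosen for illustration, and real systems typically use more elaborate schemes.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition using a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def replica_nodes(partition: int, nodes: List[str], replication_factor: int) -> List[str]:
    """Pick consecutive nodes, wrapping around, to hold copies of one partition."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return [nodes[(partition + i) % len(nodes)] for i in range(replication_factor)]


def place_records(
    keys: Iterable[str],
    nodes: List[str],
    num_partitions: int,
    replication_factor: int,
) -> Dict[str, List[str]]:
    """Return, for each node, the keys it stores (data partitioning plus replication)."""
    placement = defaultdict(list)
    for key in keys:
        partition = partition_for(key, num_partitions)
        for node in replica_nodes(partition, nodes, replication_factor):
            placement[node].append(key)
    return dict(placement)


if __name__ == "__main__":
    # Hypothetical cluster of four nodes; every record is stored on two of them.
    cluster = ["node-a", "node-b", "node-c", "node-d"]
    records = ["user-{}".format(i) for i in range(10)]
    for node, stored in sorted(place_records(records, cluster, 8, 2).items()):
        print(node, stored)
```

With a replication factor of 2, every key ends up on two different nodes, so losing a single node does not lose data (high availability), and each node can process the keys it holds locally.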


Big Data: when the data volume reaches terabytes or petabytes, or when
    traditional systems are no longer powerful enough and become significantly more expensive for data at this scale.


Even for analytics:

If there is any single guarantee, it's that your data will grow over time, probably exponentially.
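
As a rough worked example of what exponential growth means here, the sketch below takes the eight-fold increase quoted earlier for 2017 to 2025 and derives the yearly growth rate it would imply. The assumption of a constant compound rate is mine, not the source's, and the function names are illustrative.

```python
import math


def annual_growth_rate(total_factor: float, years: int) -> float:
    """Constant yearly rate that compounds to total_factor over the given years."""
    return total_factor ** (1.0 / years) - 1.0


def doubling_time(rate: float) -> float:
    """Years needed for the volume to double at a constant yearly rate."""
    return math.log(2.0) / math.log(1.0 + rate)


def projected_volume(start: float, rate: float, years: int) -> float:
    """Volume after compounding the yearly rate for the given number of years."""
    return start * (1.0 + rate) ** years


if __name__ == "__main__":
    # Eight times more data in 2025 than in 2017, assumed to grow at a steady rate.
    rate = annual_growth_rate(8.0, 2025 - 2017)
    print("implied yearly growth: {:.1%}".format(rate))
    print("doubling time in years: {:.2f}".format(doubling_time(rate)))
```

Under that assumption the rate works out to roughly 30% per year, which means the data volume doubles about every two years and eight months.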


For ML products: 

    Another point is that deep learning and machine learning models generally perform better as they are trained on more data. The field is therefore a natural complement to Big Data.
https://towardsdatascience.com/what-big-data-actually-means-d4b00e8ae00

RCTs and Statistical Evaluation