- Inherent uncertainty: ML components introduce a new kind of uncertainty into software systems. Software developers and architects are used to designing, building, and testing systems that cope with external sources of uncertainty (network latency, unpredictable user behavior, unreliable hardware), but they must now also deal with internal components that behave in a non-deterministic fashion. ML components map inputs to outputs in a probabilistic fashion. Take, for instance, an image-recognition component that categorizes its input as a cat or a dog with a certain probability, rather than producing a crisp outcome.
- Data-driven behavior: The behavior of ML components is determined only in small part by the logic that a programmer writes. Instead, behavior is learned from data. This heavy dependence on (large volumes of) data, starting early in the development stage, changes the development and deployment processes in fundamental ways. Data cleaning, versioning, and wrangling become essential parts of the development cycle. The deployment stage brings new challenges as well, such as the need to monitor how the (statistical) characteristics of production data compare with those of the training data.
- Rapid experimentation: Development of ML components is strongly experiment-based. Different ML models and different sets of parameters are tried and evaluated in rapid succession, often in parallel, in order to continuously optimize behavior. This puts the iterative nature of agile software development into overdrive. Where sprints may take 2 to 3 weeks, ML experiments are sometimes initiated, evaluated, and then discarded or adopted within hours.
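
The first two characteristics in the list above can be made concrete with a small sketch. It is not taken from the article: the function names (`softmax`, `classify`, `mean_shift`), the raw scores, the brightness values, and the threshold of 3.0 are all illustrative assumptions. The first part shows how a classifier returns a label together with a probability instead of a crisp answer. The second part shows one very simple way to compare a statistical characteristic of production data against training data; real monitoring setups typically use more robust statistical tests.

```python
import math
import statistics


def softmax(scores):
    """Turn raw model scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the most likely label together with its probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


def mean_shift(training, production):
    """Shift of the production mean, in training standard deviations."""
    spread = statistics.stdev(training)
    gap = abs(statistics.mean(production) - statistics.mean(training))
    return gap / spread


# Hypothetical raw scores from an image classifier for a single picture.
label, confidence = classify([2.0, 0.5], ["cat", "dog"])
print(f"{label} with probability {confidence:.2f}")

# Hypothetical values of one input feature, e.g. average image brightness.
training_brightness = [0.48, 0.52, 0.50, 0.47, 0.53, 0.51]
production_brightness = [0.70, 0.68, 0.73, 0.71, 0.69, 0.72]

shift = mean_shift(training_brightness, production_brightness)
print(f"mean shift: {shift:.1f} training standard deviations")
if shift > 3.0:  # illustrative threshold, an assumption of this sketch
    print("production data looks different from training data")
```

The caller of `classify` has to decide what to do with a low probability, which is exactly the kind of internal uncertainty that conventional components do not expose. A check like `mean_shift` would run in production alongside the model, not only once at training time.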

How ML Challenges Software Engineering
TL;DR →
Traditional software engineering methods have been designed and optimized to build high-quality software in a controlled and cost-effective manner. When building software systems that include Machine Learning (ML) components, those traditional software engineering methods are challenged by three distinctive characteristics: Inherent uncertainty: ML components insert a new kind of uncertainty into software systems. Data-driven behavior: The behavior of ML components is only very partially determined by the logic that a programmer writes. Instead, behavior is learned from data. Data cleaning, versioning, and wrangling become essential parts of the development cycle. Rapid experimentation: Development of ML components is strongly experiment-based.