Paper notes

# Abstract

- Nuts and bolts of industrial large-scale software modification
- Necessary when system owners hit "architectural barriers"
- Discussion about the process for
    - Problem analysis
    - Pricing and contracting for projects
    - Design and implementation of tools for code exploration and modification
    - "Details of service delivery" (?)
- Real-world example: deployed management information system
    - Required an invasive modification to make the system fit for future use
    - Around 90,000 LOC
    - Relevant architectural modification was "data expansion"

# Introduction

- Software systems that have been in use for a long time (5, 10, 20 years)
    - Require constant change to preserve or enhance the assets that are represented by these systems
    - Often lack malleability, so they resist new requirements
    - This is not unique to systems from the past; systems that are deployed today will be the legacy of the future
- Managed and automated modification of deployed software
    - Focus on 'revitalising malleability' (malleability: the capacity for adaptive change)
- Architectural modifications
    - *The software architecture of deployed software is determined by those aspects that are the hardest to change*
- Paper supplies a methodology for architectural modifications
    - ...and a real-world case
- Treatise of a real-world case
    - Problem analysis: “How to use code exploration to learn about the modification problem? When to stop? How to get the customer involved?”
    - Cost estimation: “How to obtain a precise and transparent cost estimation in a reasonable amount of time and with costs that are acceptable for the customer?”
    - Managerial realities: “How to argue in favour of automated transformations as opposed to a manual approach? How to explain the complexity of the project to the system owner?”
    - Organisational realities: “How to align offline modification at the site of the service provider with simultaneous client-site maintenance?”
    - Technological issues: “What technology to use for modification? What amount of automation is justified for the project at hand?”

# A real-world modification example

- PRODCODE project
    - Extending product codes from two to three digits
    - System was developed in the early 1970s; it provides managers with important productivity summaries on their finance and insurance products
    - Number of projects to be monitored had grown above 100, which was not supported by the system -> product code extension from 2 to 3 digits
- Technical challenges
    - Problem statement seems simple; will turn out to be more complicated
    - Impact analysis: not as trivial as it seems. You can update some PRODCODE fields to hold three digits, but the assumption that related fields are all called PRODCODE or similar is naive. Additionally, things other than fields may be relevant, for instance loop conditions (`for (i = 0; i < 100; ++i)` never mentions PRODCODE, but is still relevant). You really want some sort of data flow analysis, but even that is not necessarily enough.
    - Heterogeneous system platforms
        - Real-world systems consist of scripts, glue code, files generated by preprocessors, different languages, dialects, etc.
        - Impact analysis needs to deal with all of these
        - "very much challenges all attempts to derive simple solutions that are evidently complete and correct"
    - Semantic subtleties
        - Simple solutions are challenged by the subtleties of the programming languages used
        - For instance, in PRODCODE, they had to struggle with subtle rules for conversion between Cobol's types for numeric and alphanumeric data
        - A source of unsound modifications that nevertheless look correct
- Project drivers
    - TODO fill in later

# Software asbestos

Software becomes invaded by incidental or accidental issues, unintentionally making parts of the software system immutable and hampering anyone making necessary changes. This we call **software asbestos**.

We note that the inevitability of software asbestos and the continued need to keep business-critical systems malleable is directly linked to some of _Lehman’s laws of software evolution_ (?)

## In COBOL's defense

* Primary challenge for PRODCODE: identifying fields for product codes, since they are not readily tagged in any way
    * Different types were used, e.g. PIC 99, PIC XX, PIC 999, others (12 (!) different types were used)
    * Hardcoded literals, e.g. 100 as an error code
* This seems like bad design
    * However, COBOL did not until recently have ways of really fixing this (type declarations so you could have a PRODCODE type, constant declarations)
* Then, COBOL is the issue? Yes. But also, no.
    * The existence of support for such things does not imply that they would have been used, and...
    * Every set of tools, languages and systems has its own form of software asbestos
* Software asbestos is a fact of life [in the paper, a number of supporting statements follow]

## The future of contaminated systems

* Conservation
* Modification (what this paper is about)
* Preventive modification
* Replacement
* Starvation

### Modification vs. replacement

Replacement seems attractive, but can be costly. In fact, it is usually not worth it.

### Modification: automated vs. manual

Manual modification is basically not tractable for large projects: it is very expensive and error-prone. However, fully automated modification is also often problematic; usually a hybrid approach, with some remaining manual intervention, is most reasonable.

## A definition of software architecture

Blah blah, they have a whole treatise on why their definition is so good and hot and sexy. OK.

# Analysis of modification problems

Process for analysing the actual problem -> problem specification -> estimation of effort and costs

## The process for problem analysis

1. Explore: Use code exploration to learn about the problem
2. Model: Sketch an operational model that approximates the technical solution
3. Estimate: Estimate the effort needed to actually solve the identified part of the problem
4. Review: Reflect on the model to identify sources of incompleteness
5. Discuss: Discuss the findings with the customer's domain experts
6. Loop

## The initial problem statement for PRODCODE

**Explore**: Customer told us about fields normally containing the string PRODCODE -> grep for PRODCODE

# Implementation of tools

* Simple code exploration: grep and the like at first; later on, more advanced tools that involve grammar knowledge and employ non-trivial algorithms and data structures
* The part of the modification project that allows for generic adaptations can be implemented in terms of automated program transformations
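
To make the step from plain grep to slightly smarter exploration concrete, here is a minimal sketch in Python. It is not the tooling from the paper. The COBOL fragment, the field names and the `explore` function are all invented for illustration, and the regular expressions are a crude stand-in for the grammar-based tools mentioned above. It shows three things from these notes: fields found by name, fields reached only through a `MOVE` (a one-step approximation of data flow), and hardcoded literals such as 99 or 100 that never mention PRODCODE.

```python
import re

# Hypothetical COBOL fragment, invented for illustration only.
SAMPLE = """\
       01  WS-PRODCODE        PIC 99.
       01  IN-PRODCODE-X      PIC XX.
       01  WS-PRD             PIC 99.
       01  WS-COUNT           PIC 9(4).
           MOVE IN-PRODCODE-X TO WS-PRD.
           PERFORM VARYING I FROM 1 BY 1 UNTIL I > 99
           IF WS-PRD = 100
"""

# Simplified patterns (assumption: real COBOL needs a proper grammar).
DECL = re.compile(r"^\s*\d{2}\s+([A-Z0-9-]+)\s+PIC\s+(\S+?)\.?\s*$")
MOVE = re.compile(r"\bMOVE\s+([A-Z0-9-]+)\s+TO\s+([A-Z0-9-]+)")
LITERAL = re.compile(r"\b(99|100)\b")


def explore(source, seed="PRODCODE"):
    """Return (named, related, literals) for a first exploration pass.

    named    -- declared fields whose name contains the seed, with PIC type
    related  -- fields linked to a named field by a single MOVE
    literals -- (line number, text) of non-declaration lines using 99 or 100
    """
    lines = source.splitlines()

    # Pass 1: what grep would find, plus the declared type of each hit.
    named = {}
    for line in lines:
        match = DECL.match(line)
        if match and seed in match.group(1):
            named[match.group(1)] = match.group(2)

    # Pass 2: one step of data flow, and suspicious literals.
    related = set()
    literals = []
    for number, line in enumerate(lines, start=1):
        if DECL.match(line):
            continue  # a PIC 99 declaration is not a literal use
        move = MOVE.search(line)
        if move:
            source_field, target_field = move.groups()
            if source_field in named and target_field not in named:
                related.add(target_field)
            if target_field in named and source_field not in named:
                related.add(source_field)
        if LITERAL.search(line):
            literals.append((number, line.strip()))
    return named, related, literals
```

Calling `explore(SAMPLE)` on the invented fragment reports `WS-PRODCODE` and `IN-PRODCODE-X` by name, picks up `WS-PRD` only through the `MOVE`, and flags the two lines with 99 and 100 for manual review. A single propagation step is of course not enough in practice, which matches the point above that even data flow analysis may not be sufficient.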