NSI/ISI Statistical Software
Issues and a way forward to maximise re-use and minimise integration efforts
by Andrea Toniolo Staggemeier

Content
• Background
• Case Studies
    Data collection
    Editing and Imputation
    Time Series Analysis
    Statistical Disclosure Control
• Proposal for consideration
• Conclusion and next steps

The Big Picture - Today
• Business Functions and Operation: Business Surveys, MLD, Demography, Social Surveys, Further Analysis, NeSS, Geography, Corp. Services, i-Diss, Census
• Systems and Datastores: SAS, OpenRoad, Excel, ABF, Ingres, Super Cross, Uniface, M204, J2EE, Oracle, Notes, Visual Basic, Foxpro, MS SQL, SPSS, Excel, Foxpro, Oracle 7, Clipper, Clipper, Blaise, Citrix, ESRI, Blaise
• Infrastructure: Numa, Desktop, ZSeries, PSeries, Wintel Server, Sun

The Big Picture - 2012
• Business Functions and Operation: Business Surveys, MLD, Administrative Sources, Demography, Social Surveys, Multi-Channel Collection, Further Analysis, NeSS, Business Process Mgt, Geography, Corp. Services, i-Diss, Warehouse for Analysis, Metadata, Census, i-Dissemination
• Systems and Datastores: SAS, OpenRoad, Excel, ABF, Ingres, Super Cross, M204, J2EE, Oracle, Notes, Visual Basic, SPSS, MS SQL, Excel, Blaise, Citrix, ESRI, Oracle 7
• Infrastructure: Desktop, ZSeries, PSeries, Wintel Server, Sun, Linux

Aim of this paper
• The paper will discuss the following main concerns:
    (1) There is some great work being done within National Statistical Organisations on specialised statistical software. This is great software and works very well.
    (2) The challenge is that it is hard to predict what the long-term support will be, whether there will be updates for the software, and how additional functionality can be added to meet specific requirements.
    (3) So the question to be resolved is: how do we turn very high quality 'unsupported' software into very high quality software with a real and guaranteed future that we would all be happy to invest in?

Case Study – Blaise for Data Collection
• Great for interview-based data collection
• Areas where we are looking for a more robust solution
    Scalability
    Streamlined technologies and minimal dependencies
    Serviceability – easy to manage/deploy

Case Study – CANCEIS/Banff for Editing and Imputation
• Great methodologies
• Areas where we are looking for a more robust solution
    Supportability
    Serviceability
    Integrability
    Streamlined architecture and open APIs

Case Study – X12-ARIMA for Time Series Analysis
• Rich functionality
• Areas where we are looking for a more robust solution
    Compatibility (consistent APIs between versions)
    Serviceability (release management transparency)

Case Study – Tau-Argus for Statistical Disclosure Control
• Rich methodologies
• Areas where we are looking for a more robust solution
    License agreement
    Support agreement
    Open APIs
    Better documentation

Proposal for consideration
• Create an IT development community amongst the NSI/ISI(s) interested in making statistical services/products available.
• Establish a governance agreement which comprises a sustainable development and support model for any service made available to the community.
• Community members should establish a common development standard.

Principles to be taken into consideration by community members are:
1. Any statistical service should include enough methods to encompass the needs of the parties of the cooperation.
    1.1. Be extendable to add new methods (the parties' own methodologies)
    1.2. Be generalised to fulfil all significant needs of the parties
2. Any statistical service created and made available by a community member should also publish the full API(s) of the software, enabling better integration.
    2.1. When new release developments are planned, the systems should first consider a SOAP approach
3. Statistical standards and guides from international agencies should be used, and new requirements proposed for national standards should be made public to all participants of the development community.
4. Common vocabulary, metadata models and data definitions should be coherent and consistent across all building blocks of the statistical value chain.
5. Ensure integrity, confidentiality and security of systems and data at all times.
6. User access should be through consistent, easy-to-use interfaces and from any appropriate language.
7. Sustainable agreement on maintenance and cooperation of the developed statistical services
    7.1. Procedure for inclusion of the needs of other parties of the cooperation
    7.2. Assurance of maintenance of the system (time scope)
    7.3. High-level support assuring continuity
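
Illustration of principles 1.1 and 2
To make "extendable" and "published API" concrete, the sketch below shows one way a statistical service could expose a small, stable interface while letting each party plug in its own methods. Every name in it (ImputationService, register, methods, impute, fill_with) is hypothetical and chosen only for illustration; it does not reflect the interface of CANCEIS, Banff or any other package named in the case studies, and the two fill rules are deliberately trivial stand-ins for real methodology.

```python
from statistics import mean, median
from typing import Callable, Dict, List, Optional

# Hypothetical type: a method takes a column with gaps (None) and returns it filled.
Method = Callable[[List[Optional[float]]], List[float]]


class ImputationService:
    """Illustrative service with a small, stable, published interface."""

    def __init__(self) -> None:
        self._methods: Dict[str, Method] = {}

    def register(self, name: str, method: Method) -> None:
        """Add a method; refuses to silently overwrite an existing one."""
        if name in self._methods:
            raise ValueError(f"method already registered: {name}")
        self._methods[name] = method

    def methods(self) -> List[str]:
        """List the available method names (part of the published API)."""
        return sorted(self._methods)

    def impute(self, name: str, values: List[Optional[float]]) -> List[float]:
        """Run a registered method by name."""
        if name not in self._methods:
            raise KeyError(f"unknown method: {name}")
        return self._methods[name](values)


def fill_with(stat: Callable[[List[float]], float]) -> Method:
    """Build a method that replaces gaps with a summary of the observed values."""
    def method(values: List[Optional[float]]) -> List[float]:
        observed = [v for v in values if v is not None]
        if not observed:
            raise ValueError("no observed values to impute from")
        fill = stat(observed)
        return [fill if v is None else v for v in values]
    return method


service = ImputationService()
service.register("mean", fill_with(mean))      # a shared community method
service.register("median", fill_with(median))  # a party's own methodology

print(service.methods())
print(service.impute("mean", [1.0, None, 3.0]))
```

Callers depend only on register, methods and impute, so a party can add a method without changing the service or the code of other users. The same three operations could equally be exposed as a web service, in line with principle 2.1.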