5S Data - Sort

Goal and motivation
In the first step, "Sort", unnecessary data in the project should be deleted. This not only reclaims storage space but also helps you keep track of the most important data and makes it easier for you and your colleagues to find your way around a smaller number of files. There are several things to keep in mind during this process.
Implementation
First, duplicates and old versions of files should be deleted. Even though people like to keep them as backups, the actual backup should be done by a system. It is practically impossible to back up all of one's files manually. In addition to the time this would take, you would quickly lose track of all the versions and copies. It is therefore recommended to use a backup system that creates copies automatically (e.g. according to the 3-2-1 backup rule) and a versioning system that documents the work steps on the files and makes files easier to restore.
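Before duplicates can be deleted, they have to be found. The following is a minimal sketch, assuming that "duplicate" means byte-identical content; the function names file_digest and find_duplicates are illustrative and not part of any tool mentioned above. It only reports groups of identical files and leaves the decision about which copy to keep to a person.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files below root whose content is byte-identical."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        # Skip directories and symbolic links; only real files are hashed.
        if path.is_file() and not path.is_symlink():
            by_digest[file_digest(path)].append(path)
    # Only digests shared by more than one file indicate duplicates.
    return [group for group in by_digest.values() if len(group) > 1]
```

Calling find_duplicates(Path("my_project")) on a hypothetical project folder returns a list of groups, each containing the paths of files with identical content. Old versions that differ in content are not detected this way and still need a manual check.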
Various temporary files can arise during project work. Many programs generate them, for example, to log their processes, to store intermediate data temporarily or to restore project data after a crash. At the end of the project at the latest, these files no longer need to be stored. It is therefore advisable to configure the versioning system (such as Git) so that it does not track unwanted files (in Git, by listing them in .gitignore). In addition, a command can be used to delete all unwanted files in a project (for example, all files that end with ".temp"). Of course, it should first be analyzed carefully which files the programs still need and which can be deleted without problems.
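As a hedged illustration of such a cleanup command, the sketch below collects all files with a given ending and deletes them only when explicitly asked to. The function name remove_temp_files and the default suffix list are assumptions made for this example. The same pattern, written as the line *.temp in a .gitignore file, keeps such files out of a Git repository.

```python
from pathlib import Path


def remove_temp_files(root: Path, suffixes=(".temp",), dry_run: bool = True) -> list[Path]:
    """Find files below root with one of the given suffixes.

    With dry_run=True (the default) nothing is deleted; the matches are
    only returned so that they can be reviewed first.
    """
    matches = [
        path
        for path in sorted(root.rglob("*"))
        if path.is_file() and path.suffix.lower() in suffixes
    ]
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Running the function once with the default dry_run=True and reviewing the returned list before running it again with dry_run=False reflects the advice to check which files are still needed.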
A final point is the fact that certain files simply cannot be deleted or should not be deleted at this time. This is often the case, for example, when a permanent application (such as a server) accesses a file over a long period of time and this file is locked for other operations (such as deletion operations). However, to prevent this file from being forgotten, it should either be moved (if possible) to a folder whose contents are marked for later deletion, or the file itself should be tagged. Some systems allow you to give files tags or similar markings that make it easier to find the files to be deleted later.
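One possible way to keep track of such files is sketched below, under the assumption that a folder named "_to_delete" in the project root serves as the holding area and that a plain text list named "pending.txt" stands in for tags on systems without tagging support. Both names and the function mark_for_deletion are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Optional


def mark_for_deletion(path: Path, project_root: Path,
                      holding_dir: str = "_to_delete") -> Optional[Path]:
    """Move a file into the holding folder, or note it if it cannot be moved.

    Returns the new location, or None if the file was only recorded.
    """
    target_dir = project_root / holding_dir
    target_dir.mkdir(exist_ok=True)
    try:
        # Moving may fail, e.g. when another application has locked the file.
        return Path(shutil.move(str(path), str(target_dir / path.name)))
    except OSError:
        # Fall back to recording the path so the file is not forgotten.
        with (target_dir / "pending.txt").open("a", encoding="utf-8") as log:
            log.write(f"{path}\n")
        return None
```

Whether a file in use can be moved depends on the operating system and the application holding it, so the fallback list is the part that matters for locked files.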
1) Sort ► 2) Set in order ► 3) Shine ► 4) Standardize ► 5) Sustain
5S Data - Set in order

Goal and motivation
In the second step, "Set in order", folders and files should be organized so that the work areas of a project are mapped clearly and files can be found more easily. This saves time when working with the data, and colleagues or people who join later can find their way around the project at first glance.
Implementation
First, a folder structure should be created that corresponds to the work areas within a project. In the sciences, these are usually areas such as finances, presentations, publications, studies, source code or literature. Sorting folders alphabetically is often not useful because some areas are more important than others, so folder names can start with a number that lets the operating system sort them in the desired order, e.g. "01_studies" and "03_publications". As a rule of thumb, each level should be easy to take in at a glance (i.e. about 7 folders), and the structure should be no more than 3 levels deep so that nobody has to click unnecessarily far to reach a certain file.
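A numbered top-level structure of this kind can be created in a few lines. The sketch below is only an illustration: the order of the work areas in WORK_AREAS is an assumption chosen so that the two example names from the text come out, and the function name create_structure is hypothetical.

```python
from pathlib import Path

# Assumed priority order; adapt it to the project at hand.
WORK_AREAS = [
    "studies",
    "source_code",
    "publications",
    "presentations",
    "literature",
    "finances",
]


def create_structure(root: Path, areas: list[str]) -> list[Path]:
    """Create one numbered folder per work area below root."""
    created = []
    for index, area in enumerate(areas, start=1):
        # Two-digit prefixes keep the sort order stable beyond nine folders.
        folder = root / f"{index:02d}_{area}"
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

With the list above, create_structure(Path("my_project"), WORK_AREAS) creates "01_studies", "02_source_code", "03_publications" and so on inside a hypothetical project folder.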
 
As indicated in the previous paragraph, the naming convention for folders and files plays an important role. It lets the system sort items automatically, and it lets a person see at first glance what a folder or file contains. So that the right file can be identified quickly without opening it, the name should already contain metadata such as the publication date, the author or creator, and the title. If the date is placed at the beginning of the file name in the order year-month-day, the system sorts the files chronologically. Another point is the choice of characters. Depending on the system or software, certain characters in a name can cause problems when files are processed. It is therefore recommended to use only lowercase letters, to use underscores or hyphens instead of spaces, and to avoid special characters such as periods or question marks in file names. For example, the name for the file of a publication would be:
"20200316_tkfdm_fact_sheet_research_data_repositories".
Since titles or other meta-information can be quite long, it should be noted that the file name together with the path of the folder structure should not exceed about 255 characters, because some common systems impose limits in this range (Windows, for example, limits full paths to 260 characters by default). Care should therefore be taken to use descriptions that are as short as possible.
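The naming convention can be applied mechanically. The following sketch, with the hypothetical helper names slugify, build_file_name and fits_path_limit, builds a name from date, creator and title and checks the length of a full path against the 255-character guideline from the text. Dropping non-ASCII characters is an assumption of this example, not a rule from the convention.

```python
import re
import unicodedata
from datetime import date


def slugify(text: str) -> str:
    """Lowercase the text and replace every run of other characters with '_'."""
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "_", ascii_text.lower()).strip("_")


def build_file_name(published: date, creator: str, title: str,
                    extension: str = "") -> str:
    """Combine date (year-month-day), creator and title into one file name."""
    parts = [published.strftime("%Y%m%d"), slugify(creator), slugify(title)]
    stem = "_".join(part for part in parts if part)
    return f"{stem}.{extension.lstrip('.').lower()}" if extension else stem


def fits_path_limit(full_path: str, limit: int = 255) -> bool:
    """Check the length of path plus file name against the guideline."""
    return len(full_path) <= limit


print(build_file_name(date(2020, 3, 16), "TKFDM",
                      "Fact Sheet: Research Data Repositories"))
# prints 20200316_tkfdm_fact_sheet_research_data_repositories
```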
 
Of course, it is not always possible to see at first glance which files can be found in which folders. It is therefore advisable to create a readme file, a wiki page or similar at an early stage of the project that describes why this folder structure was chosen and where you can roughly find which files.
5S Data - Shine

Goal and motivation
In the third step, "Shine", the order once created should be maintained. Regular checks reveal typical deviations from the target state over time. Structures that do not work in practice are also noticed sooner and can be adjusted.
Implementation
It should be checked at set intervals that the defined order and organization from the first two 5S steps are maintained. Thus, at the end of a working day or after reaching a previously defined milestone, the newly added files can be checked for completeness, correct naming as well as correct classification in the structure. Redundant and/or obsolete files are deleted after it has been verified that a more up-to-date file exists and that the present file should not be retained as an interim result. At the same time, this verification ensures that the current files actually contain the respective current work results and that new copies of older files are not accidentally considered by the system to be the most current versions.
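Part of this regular check, the test for correct naming, can be automated. The sketch below assumes the convention from the "Set in order" step (date as year-month-day first, then lowercase words joined by underscores or hyphens, optionally a file extension). The pattern NAME_PATTERN and the function names are illustrative. Files that are deliberately named differently, such as readme files, would need an exception.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed convention: YYYYMMDD, then lowercase words separated by "_" or "-".
NAME_PATTERN = re.compile(r"^(\d{8})_[a-z0-9]+(?:[_-][a-z0-9]+)*(\.[a-z0-9]+)?$")


def violates_convention(name: str) -> bool:
    """Return True if a file name does not follow the assumed convention."""
    match = NAME_PATTERN.match(name)
    if not match:
        return True
    try:
        # Reject digit sequences that are not real calendar dates.
        datetime.strptime(match.group(1), "%Y%m%d")
    except ValueError:
        return True
    return False


def find_violations(root: Path) -> list[Path]:
    """List all files below root whose names break the convention."""
    return [
        path
        for path in sorted(root.rglob("*"))
        if path.is_file() and violates_convention(path.name)
    ]
```

The list returned by find_violations can serve as the starting point for the manual part of the check, such as completeness and correct classification in the structure.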
 
The documentation of the project must be supplemented if new work steps or programs have been introduced or if variables or measurement settings have been changed within experiments. If work steps within the project are not carried out digitally but are documented on paper (e.g., field observations or invoice receipts), it makes sense to digitize the documentation and store it in the existing folder structure to keep the workstation paper-free and the documentation completely digital. If this is impractical, at least the filing location and a brief description of the documents' content should be recorded in the appropriate place as a reference to simplify the search for the physical materials.
 
To ensure that the daily polishing of one's working environment fully fulfills its purpose, it is helpful to document the deviations of the actual from the target state concisely. If, for example, certain errors occur more frequently in folder naming and assignment, or if structures become too large and thus unclear, then you should consider adjusting the structure or changing your working environment to make it easier to adhere to the structure.
5S Data - Standardize

Goal and motivation
In the fourth step "Standardize," a method is developed that supports the new approach and makes it part of the daily work routine. A standard is created, i.e. a best practice is documented, which all employees are expected to follow. As a fixed rule or guideline, the standard describes the workflow briefly, concisely and in an understandable way. To increase the acceptance of the implementation in the team, it is recommended to involve the employees in the standardization process and to reach an agreement on how work processes will be designed in the future. In addition, established standards from the research community should be used, if available.
Implementation
An SOP (Standard Operating Procedure) can be used to document the standard procedure, especially for workgroups or cross-workgroup consortia that work in common directories and/or frequently share files. It contains a binding textual description of the standardized procedure and is usually divided into objectives, scope of application, description of the workflow and responsibilities. The SOP can specify the organizational structure and designation of folders and files, but also describe how to handle the established folder and file system:
- How are folders structured?
- How are folders and files named (file naming conventions)?
- Where are certain files stored (e.g. procurement processes under the project folder or finances)?
- Who is allowed to create new folder structures in shared directories?
- In which formats are files stored?
- Which international and/or discipline-specific standards are adopted and applied?
SOPs should be stored centrally and easily available to all team members. If SOPs are newly introduced, all employees should be trained. Readme files, as an alternative to a detailed SOP, can also briefly summarize the new procedure. They should preferably be stored as text files in the appropriate folders. They are also useful for documenting your own work standards.
When working in large teams, it can make sense to restrict access rights to certain folders and files, for example by granting read-only rights. If people from different workgroups share laboratory PCs, a notice at the workstation that summarizes the standards in a brief and visualized form can regularly remind users of the new procedure and facilitate collaborative work. Alternatively, filing the data can be assigned as a standard task to selected employees, for example those who are responsible for the respective laboratory equipment.
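As a rough sketch of the read-only idea, the function below removes the write permission from all files in a folder, for instance a folder with finished raw data. The name make_read_only is hypothetical. The example assumes a POSIX-style permission model; on Windows, Python's chmod only toggles the read-only flag, and on shared network drives access rights are usually managed by the server administration instead.

```python
import stat
from pathlib import Path

# Write permission for owner, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(root: Path) -> int:
    """Remove write permission from all files below root.

    Returns the number of files whose permissions were changed.
    """
    changed = 0
    for path in root.rglob("*"):
        if path.is_file() and not path.is_symlink():
            mode = path.stat().st_mode
            if mode & WRITE_BITS:
                # Keep all other permission bits and clear only the write bits.
                path.chmod(stat.S_IMODE(mode) & ~WRITE_BITS)
                changed += 1
    return changed
```

This protects against accidental changes rather than deliberate ones, since the owner of a file can restore the write permission at any time.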
5S Data - Sustain

Goal and motivation
In the fifth step "Sustain", discipline should be developed. This is the most difficult step because following the rules and standards should become a daily routine. This means constantly fighting against chaos and disorder. The standardized procedure should be internalized by the employees in such a way that it can be applied without thinking about it.
Implementation
Self-discipline is first and foremost a requirement for each team member. The development of a good work attitude is crucial here. Only when everyone internalizes the standardized procedure and disciplines themselves to follow the rules can improvements occur. Supervisors have a special role model function here.
This process is additionally supported by the central and easily accessible filing of the written standard operating procedure. It should also be an essential part of the introductory information for new employees. It is also recommended that the standard operating procedure be presented or recalled at regular intervals (e.g., every six months or annually) in internal group workshops. The following topics can be addressed:
- File storage (folder system and naming convention)
- Server setup
- Backup strategy
- Versioning tools (e.g. Git)
- Literature management programs
- Internal databases
- Handling of personal data
- Communication tools
Workshops not only provide an opportunity to present the approach but also to gather feedback from team members. Existing structures are thus regularly checked for their functionality and can be adapted as needed and in a collaborative manner.
Even if it takes some time and several checks until the new work processes and standards have become established in a team, the standardization of work processes is worthwhile. Documents are found more easily (e.g., in collaborative work or when covering for colleagues), and valuable storage space is freed up (e.g., through continuous cleanup and deletion). In the end, not only productivity increases but also employee satisfaction.