File and Folder Names and Organization
The requirements in this section are critical to completing your upload in a timely manner. If the data delivery does not meet all of these requirements, there will be significant delays in completing the upload, and you may be billed for the associated technical time.
Special Characters
File and folder names should contain only alphanumeric characters and dashes. Special ASCII characters should not be used. For example, do not include ampersands (&), commas (,) or apostrophes (') in file or folder names. If paths to files include other types of characters, we may have to rename the files and/or folders, which adds time and cost to the upload process. All files should be named after the begcontrol [or begbates] number (e.g., begcontrol.ext), and file names are case sensitive.
File Name Length: The name of a file should not exceed 25 characters as various processes may need to be run against the files.
Path Length: The path for a file should not exceed 255 characters, including the name of the file.
Folder Organization: A single folder should not contain more than 10,000 files. Larger folders slow processing of the files and may also slow review of the files in that folder.
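The naming and length limits above can be checked before a delivery is sent. The following is a minimal sketch, not a Catalyst tool: the function name check_paths is hypothetical, paths are assumed to be relative paths inside the delivery, the 25-character limit is assumed to include the extension, and a single dot before the extension is assumed to be acceptable.

```python
import re
from collections import Counter
from pathlib import PureWindowsPath

MAX_NAME_LEN = 25               # file name limit (assumed to include the extension)
MAX_PATH_LEN = 255              # full path limit, file name included
MAX_FILES_PER_FOLDER = 10_000   # folder size limit

# Letters, digits, and dashes only, plus one optional ".ext" suffix (assumption).
_FILE_NAME = re.compile(r"[A-Za-z0-9-]+(\.[A-Za-z0-9]+)?")
_FOLDER_NAME = re.compile(r"[A-Za-z0-9-]+")


def check_paths(paths):
    """Return (path, problem) tuples for a list of relative delivery paths."""
    problems = []
    per_folder = Counter()
    for raw in paths:
        p = PureWindowsPath(raw)  # accepts both backslash and slash separators
        per_folder[str(p.parent)] += 1
        if len(raw) > MAX_PATH_LEN:
            problems.append((raw, "path longer than 255 characters"))
        if len(p.name) > MAX_NAME_LEN:
            problems.append((raw, "file name longer than 25 characters"))
        if not _FILE_NAME.fullmatch(p.name):
            problems.append((raw, "file name has disallowed characters"))
        for folder in p.parent.parts:
            if not _FOLDER_NAME.fullmatch(folder):
                problems.append((raw, "folder name has disallowed characters"))
    for folder, count in per_folder.items():
        if count > MAX_FILES_PER_FOLDER:
            problems.append((folder, "more than 10,000 files in folder"))
    return problems
```

Calling check_paths([r"Files001\000001.DOC"]) returns an empty list, while a name such as "Smith & Co.DOC" is reported for its disallowed characters.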
File Organization
All files that correspond to a record should have an associated field in the load file that includes the full path and filename for the specific file. For example, if a record will have a native file and a text file loaded, then there should be a field called “Filepath” and a field called “Textpath” in the load file. The data in each field points to the specific file as it appears in the uncompressed ZIP/RAR file.
Below are two common delivery configurations that illustrate how each record has one entry in the load file that references the two potential file types.
Native File Example:
BegControl Filepath Textpath
000001 Files001\000001.DOC Files001\000001.TXT
Image File Example:
BegControl Filepath Indexpath
000002 Files001\000002.PDF Files001\000002.TXT
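One way to confirm that every path in the load file points to a real file is to check the paths against the uncompressed delivery. This is a rough sketch under stated assumptions: the load file has already been parsed into one dictionary per record (the parsing step depends on the delimiters in use and is not shown), and the function name missing_files is hypothetical.

```python
from pathlib import Path, PureWindowsPath


def missing_files(rows, root, path_fields=("Filepath", "Textpath")):
    """Return (BegControl, field, path) for each load-file path not found under root."""
    missing = []
    for row in rows:
        for field in path_fields:
            rel = row.get(field, "")
            if not rel:
                continue  # this record has no file of this type
            # Load-file paths use backslashes; rebuild them portably under root.
            target = Path(root).joinpath(*PureWindowsPath(rel).parts)
            if not target.is_file():
                missing.append((row.get("BegControl"), field, rel))
    return missing
```

For an image delivery like the second example, the call would pass path_fields=("Filepath", "Indexpath") instead of the default.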
Extraneous Files
Please be aware that every file included in a delivery counts toward storage costs, regardless of its use on the site. For example, OCR text files that are not used because native files are being indexed and viewed, or files that do not have corresponding metadata records, will be billed even though they are not accessible on the site.
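Extraneous files can be found before delivery by comparing the files in the delivery with the paths the load file references. The sketch below assumes both lists are already available as relative paths; the function name extraneous_files is hypothetical. The comparison is case sensitive and treats backslashes and forward slashes as the same separator.

```python
from pathlib import PureWindowsPath


def extraneous_files(delivered, referenced):
    """Return delivered paths that no load-file field references, sorted."""
    def norm(path):
        # Normalize separators only; letter case is left untouched.
        return PureWindowsPath(path).as_posix()

    wanted = {norm(p) for p in referenced}
    return sorted(p for p in delivered if norm(p) not in wanted)
```

Any path this returns is a candidate for removal from the delivery, or for a missing load file entry.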
Zero-Filled Dates
Catalyst handles incomplete dates that are zero-filled (those with 00 for the month and/or day and/or 0000 for the year) as follows:
Any date field with a value of 00/00/0000 or 00000000 will be made NULL/empty.
Any date with 00 for month or day will be made NULL/empty.
Any date with 0000 for the year will be made NULL/empty.
The deleted date information is stored in the XML file in the “indexissuedetail” field. Because the invalid date data is kept, the document will still be returned when a search is run using the invalid dates.
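The zero-fill rules can be illustrated with a short sketch. It is not Catalyst's implementation: the function name normalize_zero_filled is hypothetical, and the MM/DD/YYYY and MMDDYYYY field order is an assumption based on the 00/00/0000 and 00000000 examples. The second return value stands in for the original text that would be retained alongside the record.

```python
import re


def normalize_zero_filled(value):
    """Return (stored_value, retained_original) for one date string."""
    text = value.strip()
    # Assumed layouts: MM/DD/YYYY or MMDDYYYY.
    match = (re.fullmatch(r"(\d{2})/(\d{2})/(\d{4})", text)
             or re.fullmatch(r"(\d{2})(\d{2})(\d{4})", text))
    if not match:
        return value, None  # not in an assumed layout; left untouched
    month, day, year = match.groups()
    if month == "00" or day == "00" or year == "0000":
        return None, value  # field becomes NULL; original text is kept
    return value, None
```

For example, normalize_zero_filled("06/00/2009") returns (None, "06/00/2009"), while a complete date such as "06/15/2009" is passed through unchanged.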
Catalyst can accept true date and time data in the date fields, but to optimize search functionality we ask that any date/time fields be split into separate date fields and separate time fields.
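Splitting a combined value before building the load file might look like the following sketch. The function name split_datetime is hypothetical, and both the input format and the output formats are assumptions; they should be adjusted to match the actual source data and the agreed delivery format.

```python
from datetime import datetime


def split_datetime(value, fmt="%m/%d/%Y %H:%M:%S"):
    """Split one combined date/time string into (date_text, time_text)."""
    parsed = datetime.strptime(value, fmt)  # raises ValueError on a mismatch
    return parsed.strftime("%m/%d/%Y"), parsed.strftime("%H:%M:%S")
```

With the default format, split_datetime("06/15/2009 14:30:00") returns ("06/15/2009", "14:30:00"), which can then be written to a date field and a time field respectively.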