BS ISO 28500:2017

Current

The latest, up-to-date edition.

Information and documentation. WARC file format

Available format(s)

Hardcopy, PDF

Language(s)

English

Published date

11-09-2017

Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions and abbreviated terms
4 File and record model
5 Named fields
6 WARC record types
7 Record segmentation
8 WARC file name, size and compression
Annex A (informative) - Use cases for writing WARC records
Annex B (informative) - Examples of WARC records
Annex C (informative) - WARC file size and name recommendations
Annex D (informative) - Compression recommendations
Bibliography

This document specifies the WARC file format:

  • to store both the payload content and control information from mainstream Internet application layer protocols, such as HTTP, DNS, and FTP;

  • to store arbitrary metadata linked to other stored data (e.g. subject classifier, discovered language, encoding);

  • to support data compression and maintain data record integrity;

  • to store all control information from the harvesting protocol (e.g. request headers), not just response information;

  • to store the results of data transformations linked to other stored data;

  • to store a duplicate detection event linked to other stored data (to reduce storage in the presence of identical or substantially similar resources);

  • to be extended without disruption to existing functionality;

  • to support handling of overly long records by truncation or segmentation, where desired.
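As an illustration of the record model implied by the scope above, the following is a minimal sketch, not an excerpt from the standard. It assumes the commonly documented WARC 1.1 layout: a version line, CRLF-terminated named fields, a blank line, the content block, and two trailing CRLFs. The function names `build_warc_record` and `parse_warc_record` are hypothetical, the parser handles only single-line fields in uncompressed records, and the standard itself should be consulted for the normative grammar.

```python
import uuid
from datetime import datetime, timezone
from typing import Dict, Tuple

CRLF = b"\r\n"


def build_warc_record(record_type: str, target_uri: str, block: bytes,
                      content_type: str = "text/plain") -> bytes:
    """Serialize one uncompressed record (hypothetical helper, assumed WARC 1.1 layout)."""
    fields = [
        ("WARC-Type", record_type),                      # e.g. "resource", "response", "metadata"
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(block))),             # length of the block in bytes
    ]
    header = b"WARC/1.1" + CRLF
    for name, value in fields:
        header += f"{name}: {value}".encode("utf-8") + CRLF
    # A blank line ends the header; two CRLFs follow the block.
    return header + CRLF + block + CRLF + CRLF


def parse_warc_record(record: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split a record produced above into version, named fields, and block."""
    head, _, rest = record.partition(CRLF + CRLF)
    lines = head.decode("utf-8").split("\r\n")
    version = lines[0]
    fields = dict(line.split(": ", 1) for line in lines[1:])
    length = int(fields["Content-Length"])
    return version, fields, rest[:length]


if __name__ == "__main__":
    rec = build_warc_record("resource", "http://example.org/", b"hello")
    version, fields, block = parse_warc_record(rec)
    print(version, fields["WARC-Type"], block)
```

Reading the block by `Content-Length` rather than by searching for a terminator is what lets a record carry arbitrary binary payloads, including bytes that happen to look like a record boundary.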

Committee
IDT/2
Development Note
Supersedes 08/30167515 DC. (08/2009) Supersedes 16/30345920 DC. (09/2017)
Document Type
Standard
Pages
36
Publisher Name
British Standards Institution
Status
Current

Standards Relationship
ISO 28500:2017 Identical

ISO 8601:2004 Data elements and interchange formats - Information interchange - Representation of dates and times

