http://dragoman.org/dagstuhl
Notes of M.T. Carrasco Benitez
-
URI:
A Uniform Resource Identifier (URI) is a compact sequence of
characters that identifies an abstract or physical resource.
RFC-3986 - URI syntax
-
Resource:
-
whatever might be identified by a URI
... a collection of other resources.
RFC-3986
-
A network data object or service that can be identified by a URI,
as defined in section 3.2. Resources may be available in multiple
representations (e.g. multiple languages, data formats, size, and
resolutions) or vary in other ways.
RFC-2616 - HTTP
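Sketch: splitting a URI into the generic components named in RFC-3986 (scheme, authority, path, query, fragment), using Python's standard `urllib.parse`. The helper name `uri_components` is illustrative only and does not come from these notes.

```python
from urllib.parse import urlsplit


def uri_components(uri):
    """Return the generic URI components as a dict (illustrative helper)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# The URI of these notes, taken from the first line.
components = uri_components("http://dragoman.org/dagstuhl")
for name, value in components.items():
    print(f"{name}: {value!r}")
```

The URI only identifies the resource; which representation is returned (language, format, size) is a separate matter, as the RFC-2616 definition notes.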
Day 1 - 3 September 2012
Introduction
Brainstorming
- Users' perspective
- Interaction & integration NLP/SW
- Open source OS
- Standard for Multilingual Web Sites (MWS)
- Corpora - linked open data (LOD)
- Unification theory
- SW & LR - infrastructure - institutional governance
- Multiculturalism
- Representation of formal languages for multilinguality
- Multiple views on data
- Standards
- Localization of SW resources
- Knowledge in language representation
- Business models for using SW technology
- Scalability of everything
- Cross-lingual links
- Under-resourced languages
- Benchmarking
- Extension of the LOD by a multilingual layer - language should be a dimension
- Language labeling - Primary Language in HTML
- Size of the data - access to a whole set (1T) and to a record (1K)
- Structure of the data - complex - simple pair
- Language symbol Ϣ - COPTIC CAPITAL LETTER SHEI - U+03E2
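Sketch: checking the proposed language symbol against the Unicode Character Database with Python's standard `unicodedata` module. The variable name `language_symbol` is illustrative only.

```python
import unicodedata

# Candidate language symbol from the brainstorming list.
language_symbol = "\u03e2"

# Print the code point in U+XXXX notation and its official Unicode name.
code_point = f"U+{ord(language_symbol):04X}"
print(code_point, unicodedata.name(language_symbol))
```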
-
My domain place:
Multilingual web and resources (corpus, LOD)
Panel 1
Group C: Sebastian + Hans - bootstrapping a translingual web
Kimmo Rossi
Day 2 - 4 September 2012
Reports of Panel 1
- Group C: Sebastian
- Group A: Roberto
- Group B: Josef van Genabith
Panel 2
Group C: multilingual parallel texts
- Raw
- Organised - Multilingual Dataset Format (muset)
- URI
- http://example.com/foo
- http://example.com/foo/de
- TaX - Tabularisation of XML data - tree to table
- Grammatical
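Sketch: a generic tree-to-table flattening of a small parallel text, with one language-specific URI per row following the `http://example.com/foo` and `http://example.com/foo/de` pattern above. This is not the muset or TaX format. The XML layout (`text`, `seg`, `v`, `lang`) and the helper names are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two segments, each with an English and a German version.
SAMPLE = """<text id="foo">
  <seg n="1"><v lang="en">Hello</v><v lang="de">Hallo</v></seg>
  <seg n="2"><v lang="en">Thanks</v><v lang="de">Danke</v></seg>
</text>"""

BASE_URI = "http://example.com/foo"  # base URI as written in the notes


def language_uri(base, lang):
    """Append a language tag as a path segment, e.g. .../foo/de."""
    return base.rstrip("/") + "/" + lang


def tree_to_table(xml_text, base=BASE_URI):
    """Flatten the tree into rows of (segment, language, URI, text)."""
    root = ET.fromstring(xml_text)
    rows = []
    for seg in root.findall("seg"):
        for version in seg.findall("v"):
            lang = version.get("lang")
            rows.append((seg.get("n"), lang, language_uri(base, lang), version.text))
    return rows


rows = tree_to_table(SAMPLE)
for row in rows:
    print("\t".join(row))
```

Each row is self-describing, so a single record can be fetched without loading the whole tree; the cost is that the segment number and URI are repeated on every row.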
Industrial & Application Perspective
Day 3 - 5 September 2012
Reports of Panel 2
-
Group A:
Resources for semantic web
- standards
- OpenAnnotation
-
Group B:
LEMON
-
Group C:
Parallel corpora
Panel 3
Group: Interoperability
Day 4 - 6 September 2012
Reports of Panel 3
-
Collaborative issues.
Cross-domain adaptation of terminologies and ontologies.
Pervasive multimodality and multilinguality of the web (of data)
Gerhard Budin
-
Under-resourced languages.
-
Antoine Isaac
Panel 4
Group multilingual web
Reports of Panel 4
Demo
Day 5 - 7 September 2012
Wrap-up
Dimensions
-
Low-hanging fruit
-
Emerging directions
-
Vision
Fields
-
Standardization
-
Scalability
-
Precision-oriented MSW
-
Under-resourced languages
-
Lexicalization of linked data