LAIR: A Language for Automated Semantics-Aware Text Sanitization based on Frame Semantics
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
LAIR: A Language for Automated Semantics-Aware Text Sanitization based on Frame Semantics. / Hedegaard, Steffen; Houen, Søren; Simonsen, Jakob Grue.
Proceedings of the 3rd IEEE International Conference on Semantic Computing (ICSC 2009). IEEE Computer Society Press, 2009. p. 47-52.
RIS
TY - GEN
T1 - LAIR: A Language for Automated Semantics-Aware Text Sanitization based on Frame Semantics
AU - Hedegaard, Steffen
AU - Houen, Søren
AU - Simonsen, Jakob Grue
N1 - Conference code: 3
PY - 2009
Y1 - 2009
AB - We present LAIR, a domain-specific language that enables users to specify actions to be taken upon meeting specific semantic frames in a text, in particular to rephrase and redact the textual content. While LAIR presupposes superficial knowledge of frames and frame semantics, it requires only limited prior programming experience. It contains neither scripting nor I/O primitives, nor general loop constructions, and it is not Turing-complete. We have implemented a LAIR compiler and integrated it in a pipeline for automated redaction of web pages. We detail our experience with automated redaction of web pages for subjectively undesirable content; initial experiments suggest that using a small language based on semantic recognition of undesirable terms can be highly useful as a supplement to traditional methods of text sanitization.
U2 - 10.1109/ICSC.2009.79
DO - 10.1109/ICSC.2009.79
M3 - Article in proceedings
SN - 978-0-7695-3800-6
SP - 47
EP - 52
BT - Proceedings of the 3rd IEEE International Conference on Semantic Computing (ICSC 2009)
PB - IEEE Computer Society Press
Y2 - 14 September 2009 through 16 September 2009
ER -
ID: 16239403