Semantic retrieval from video databases is becoming an important research topic in multimedia. Such tasks require video data representation models that capture the relationships between low-level visual cues and the semantic concepts inferred from them. This paper presents work, grounded in semiotic studies, that extracts simple visual features from commercials and statistically analyzes these features and their relationships with high-level semantic terms. For feature extraction, well-known algorithms have been implemented and enhanced, and a novel probabilistic approach to color naming has been developed. The statistical analysis consists of finding correlations between variables and identifying the dimensions in feature space that best explain the variance of the data set. The paper concludes with observations on how commercials are grouped in feature space with respect to different levels of semantics. © 2000 Academic Press.
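
The statistical analysis summarized above (correlations between variables, and the dimensions that best explain the variance) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a principal component analysis computed via a singular value decomposition, and the feature names (`saturation`, `brightness`, `cut_rate`) and the synthetic data are hypothetical stand-ins for features extracted from commercials.

```python
import numpy as np


def correlation_and_pca(features):
    """Correlate features and find the directions of largest variance.

    features: array of shape (n_samples, n_features), one row per video.
    Returns the correlation matrix, the principal directions (as rows),
    the fraction of variance explained by each direction, and the data
    projected onto those directions.
    """
    X = np.asarray(features, dtype=float)

    # Pairwise Pearson correlations between feature columns.
    corr = np.corrcoef(X, rowvar=False)

    # Center the data; PCA directions are the right singular vectors.
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)

    # Singular values relate to per-direction variance.
    variance = s**2 / (X.shape[0] - 1)
    explained_ratio = variance / variance.sum()

    # Coordinates of each sample in the principal-direction basis.
    scores = Xc @ vt.T
    return corr, vt, explained_ratio, scores


# Hypothetical synthetic features (assumption: not data from the paper).
rng = np.random.default_rng(0)
n = 200
saturation = rng.normal(size=n)
brightness = 0.8 * saturation + 0.2 * rng.normal(size=n)  # correlated pair
cut_rate = rng.normal(size=n)                             # independent
X = np.column_stack([saturation, brightness, cut_rate])

corr, components, ratio, scores = correlation_and_pca(X)
print(np.round(corr, 2))
print(np.round(ratio, 2))
```

In this sketch the features are centered but not rescaled, so features with larger numeric ranges dominate the leading directions; standardizing each column first is a common alternative when features have different units.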