We will present several nonparametric methods for estimating a density with compact support or on a domain whose geometry may be complex. We will see how the quality of classical estimators deteriorates near the boundary of the domain. In particular, we will propose a new estimator based on local polynomials for estimating the density at a point x. We will see that this estimator can adapt to different geometries and that it has optimality properties in terms of the mean squared error and of the regularity of the function to be estimated. We will compare this method with the sparr method, a popular alternative for density estimation on domains with complicated geometry.
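The boundary effect mentioned above can be illustrated with a minimal sketch. It is not the local polynomial estimator of the talk, only a classical Gaussian kernel estimator applied to a uniform sample on [0, 1]. The sample size, bandwidth `h`, and function name `gaussian_kde_at` are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kde_at(x, sample, bandwidth):
    """Classical Gaussian kernel density estimate evaluated at a single point x."""
    u = (x - sample) / bandwidth
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return kernel.mean() / bandwidth

# Illustrative setup (assumed): uniform data, so the true density is 1 on [0, 1].
rng = np.random.default_rng(0)
sample = rng.uniform(0.0, 1.0, size=20000)
h = 0.05  # bandwidth chosen by hand for this sketch

estimate_interior = gaussian_kde_at(0.5, sample, h)  # far from the boundary
estimate_boundary = gaussian_kde_at(0.0, sample, h)  # exactly on the boundary

print(f"interior estimate: {estimate_interior:.3f}")
print(f"boundary estimate: {estimate_boundary:.3f}")
```

At the boundary point, roughly half of the kernel mass falls outside the support, so the classical estimate is close to half the true density, while the interior estimate stays close to 1.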
Structural equation models aim to represent and describe relationships between constructs, and between constructs and observed variables, whereas multiblock data analysis focuses on explaining the relationships between several blocks of variables. Multiblock data analysis enables the creation of latent variable scores and the estimation of structural equation models. A general framework is provided by Regularized Generalized Canonical Correlation Analysis (RGCCA). In this talk, I present application examples to illustrate a context for understanding the fundamental concepts of both fields and their interconnections. I review the main definitions related to RGCCA, the optimization problem, the search algorithm, and special cases. Further research is outlined.
We introduce two new approaches to clustering categorical and mixed data, both based on Condorcet clustering with a fixed number of groups: $$\alpha$$-Condorcet for categorical data and Mixed-Condorcet for mixed data. Like k-modes, these approaches are essentially based on similarity and dissimilarity measures. The presentation is divided into three parts. First, we propose a new Condorcet criterion with a fixed number of groups for assigning cases to clusters. Second, we propose a heuristic algorithm to carry out the task. Third, we compare $$\alpha$$-Condorcet clustering with k-modes clustering, and Mixed-Condorcet with k-prototypes. The comparison uses a quality index, an accuracy measure, and a within-cluster sum-of-squares index.
Our findings are illustrated using real datasets: the feline dataset, the US Census 1990 dataset and other data.
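As a point of reference for the k-modes baseline, the following minimal sketch shows the simple matching dissimilarity and the attribute-wise mode that k-modes relies on. It does not implement the Condorcet criterion from the talk. The toy records and the function names `matching_dissimilarity` and `column_modes` are hypothetical and are not taken from the datasets above.

```python
from collections import Counter

def matching_dissimilarity(a, b):
    """Simple matching dissimilarity: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(rows):
    """Attribute-wise mode of a group of categorical records (a k-modes centroid)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

# Hypothetical toy records with three categorical attributes each.
rows = [
    ("short", "black", "indoor"),
    ("short", "white", "indoor"),
    ("long", "black", "indoor"),
]

mode = column_modes(rows)
distances = [matching_dissimilarity(row, mode) for row in rows]
print(mode)
print(distances)
```

Each record is compared with the group mode attribute by attribute, which is the categorical counterpart of the distance to a centroid in k-means.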