Title: | Biplot Graphical Interface for LDA Models |
---|---|
Description: | Provides a web-based graphical user interface (GUI) for producing Biplot representations from news scraped from digital newspapers, using the Bayesian approach of Latent Dirichlet Allocation (LDA) and machine learning algorithms. Implements the LDA methods described by Blei, David M., Andrew Y. Ng and Michael I. Jordan (2003) <https://jmlr.org/papers/volume3/blei03a/blei03a.pdf>, and the Biplot methods described by Gabriel K.R. (1971) <doi:10.1093/biomet/58.3.453> and Galindo-Villardon P. (1986) <https://diarium.usal.es/pgalindo/files/2012/07/Questiio.pdf>. |
Authors: | Luis Pilacuan-Bonete [cre, aut] |
Maintainer: | Luis Pilacuan-Bonete <[email protected]> |
License: | GPL-3 |
Version: | 0.1.2 |
Built: | 2025-02-28 05:19:45 UTC |
Source: | https://github.com/cran/LDABiplots |
Pearson Correlation for Sparse Matrices.
More memory- and time-efficient than cor(as.matrix(x)).
dtmcorr(x)
x |
A matrix, potentially a sparse matrix such as a "dgCMatrix" object |
a correlation matrix
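The efficiency claim above rests on never materializing a dense copy of the input. The following is a minimal sketch of that idea, assuming the Matrix package is available; `sparse_cor` is a hypothetical helper written for illustration and is not the source of dtmcorr.

```r
library(Matrix)

# Hypothetical helper (not the package's implementation): Pearson correlation
# between the columns of a sparse matrix, computed from the sparse cross
# product so that the n x p input itself is never densified.
sparse_cor <- function(x) {
  n <- nrow(x)
  col_means <- colMeans(x)
  # cov = (X'X - n * m m') / (n - 1); only the p x p result is made dense
  cov_mat <- (as.matrix(crossprod(x)) - n * tcrossprod(col_means)) / (n - 1)
  sds <- sqrt(diag(cov_mat))
  cov_mat / tcrossprod(sds)
}

# Small illustrative matrix (made-up values)
x <- sparseMatrix(i = c(1, 2, 3, 4, 1, 3), j = c(1, 1, 2, 2, 3, 3),
                  x = c(2, 1, 3, 1, 1, 4), dims = c(4, 3),
                  dimnames = list(NULL, c("a", "b", "c")))

# Compare against the dense reference computation
all.equal(sparse_cor(x), cor(as.matrix(x)), check.attributes = FALSE)
```

Only the p x p cross product is converted to a dense matrix, which is the part that a correlation matrix needs to hold anyway.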
Remove terms from a document-term matrix, together with any documents left without terms, based on the term frequency-inverse document frequency (tf-idf).
Specify either the maximum number of terms to keep (argument top), a tf-idf cutoff (argument cutoff), or a quantile (argument prob).
dtmremovetfidf(dtm, top, cutoff, prob, remove_emptydocs = TRUE)
dtm |
an object of class "dgCMatrix" |
top |
integer with the number of terms which should be kept as defined by the highest mean tfidf |
cutoff |
numeric cutoff value to keep only terms in |
prob |
numeric quantile indicating to keep only terms in |
remove_emptydocs |
logical indicating whether to remove documents that contain no terms after the term removal is executed. Defaults to TRUE. |
a sparse Matrix, as returned by sparseMatrix, in which terms with high tfidf are kept and documents without any remaining terms are removed
Term Frequency - Inverse Document Frequency calculation. Averaged by each term.
dtmtfidf(dtm)
dtm |
an object of class "dgCMatrix" |
a vector with tfidf values, one for each term in the dtm matrix
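To make the per-term averaging and the term removal described above concrete, here is a minimal sketch using one common tf-idf definition (within-document relative frequency times log(N / document frequency)). That definition, the helper name `mean_tfidf` and the toy data are assumptions for illustration; the exact formula used by dtmtfidf may differ.

```r
library(Matrix)

# Hypothetical helper: mean tf-idf per term, averaged over the documents
# in which the term occurs. Works on the non-zero entries only.
mean_tfidf <- function(dtm) {
  trip <- summary(dtm)                                  # triplets: i (doc), j (term), x (count)
  doc_len <- rowSums(dtm)                               # total count per document
  doc_freq <- tabulate(trip$j, nbins = ncol(dtm))       # documents containing each term
  tf <- trip$x / doc_len[trip$i]                        # relative term frequency
  idf <- log(nrow(dtm) / doc_freq)                      # inverse document frequency
  tfidf <- sparseMatrix(i = trip$i, j = trip$j, x = tf * idf[trip$j],
                        dims = dim(dtm))
  setNames(colSums(tfidf) / doc_freq, colnames(dtm))
}

# Toy document-term matrix (made-up counts)
dtm <- sparseMatrix(i = c(1, 1, 2, 3, 3), j = c(1, 2, 1, 1, 3),
                    x = c(1, 1, 1, 1, 3), dims = c(3, 3),
                    dimnames = list(c("d1", "d2", "d3"), c("a", "b", "c")))
scores <- mean_tfidf(dtm)

# Keep the 2 terms with the highest mean tf-idf, then drop empty documents,
# mirroring the roles of the top and remove_emptydocs arguments.
keep <- names(sort(scores, decreasing = TRUE))[1:2]
reduced <- dtm[, keep, drop = FALSE]
reduced <- reduced[rowSums(reduced) > 0, , drop = FALSE]
```

In this toy data the term "a" occurs in every document, so its idf is zero and it is dropped first; document "d2" contains only that term and is removed as an empty document.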
This function computes the GH Biplot representation (Gabriel, 1971).
GHBiplot (X, Transform.Data = 'scale')
X |
array_like; |
Transform.Data |
character; |
Algorithm used to construct the GH Biplot. The Biplot is obtained by configuring markers for individuals and markers for variables in a reference system defined by the factorial axes resulting from the singular value decomposition (SVD).
GHBiplot returns a list containing the following components:
eigenvalues |
array_like; |
explvar |
array_like; |
loadings |
array_like; |
coord_ind |
array_like; |
coord_var |
array_like; |
Gabriel, K. R. (1971). The Biplot graphic display of matrices with applications to principal components analysis. Biometrika, 58(3), 453-467.
GHBiplot(mtcars)
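The construction described above can be sketched directly with base R. This is a minimal illustration of the textbook GH factorization (row markers G = U, column markers H = V D), assuming that Transform.Data = 'scale' means column standardization; the object names are illustrative and the package may apply additional scaling to its returned coordinates.

```r
# Column-standardize the data (assumed meaning of Transform.Data = 'scale')
X <- scale(as.matrix(mtcars))
dec <- svd(X)                      # X = U D V'

eigenvalues <- dec$d^2
explvar <- 100 * eigenvalues / sum(eigenvalues)   # % of variance per axis

# GH factorization: row markers G = U, column markers H = V D
G <- dec$u
H <- dec$v %*% diag(dec$d)
rownames(G) <- rownames(X)
rownames(H) <- colnames(X)

# With all axes retained, the markers reproduce the standardized data
max(abs(X - G %*% t(H)))

# Inner products of the column markers recover the correlation matrix,
# which is why the GH Biplot favours the representation of variables
all.equal(H %*% t(H) / (nrow(X) - 1), cor(mtcars), check.attributes = FALSE)
```

A two-dimensional biplot would plot only the first two columns of G and H.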
This function computes the HJ Biplot representation (Galindo, 1986).
HJBiplot (X, Transform.Data = 'scale')
X |
array_like; |
Transform.Data |
character; |
Algorithm used to construct the HJ Biplot. The Biplot is obtained by configuring markers for individuals and markers for variables in a reference system defined by the factorial axes resulting from the singular value decomposition (SVD).
HJBiplot returns a list containing the following components:
eigenvalues |
array_like; |
explvar |
array_like; |
loadings |
array_like; |
coord_ind |
array_like; |
coord_var |
array_like; |
Gabriel, K. R. (1971). The Biplot graphic display of matrices with applications to principal components analysis. Biometrika, 58(3), 453-467.
Galindo-Villardon, P. (1986). Una alternativa de representacion simultanea: HJ-Biplot (An alternative of simultaneous representation: HJ-Biplot). Questiio, 10, 13-23.
HJBiplot(mtcars)
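As a minimal sketch of the HJ construction, assuming the textbook definition in which both sets of markers carry the singular values (rows U D, columns V D) and that 'scale' means column standardization. The names `hj_rows` and `hj_cols` are illustrative and need not match the values the package returns.

```r
X <- scale(as.matrix(mtcars))
dec <- svd(X)                          # X = U D V'

# HJ markers: both rows and columns are scaled by the singular values
hj_rows <- dec$u %*% diag(dec$d)       # markers for individuals
hj_cols <- dec$v %*% diag(dec$d)       # markers for variables

# Row markers coincide with the projections of X onto the factorial axes
all.equal(hj_rows, X %*% dec$v, check.attributes = FALSE)

# Column markers reproduce the cross-product matrix X'X
all.equal(hj_cols %*% t(hj_cols), crossprod(X), check.attributes = FALSE)
```

Unlike the GH and JK factorizations, the product of the HJ row and column markers does not reproduce X; the method instead represents rows and columns on the same axes with the same quality.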
This function computes the JK Biplot representation (Gabriel, 1971).
JKBiplot (X, Transform.Data = 'scale')
X |
array_like; |
Transform.Data |
character; |
Algorithm used to construct the JK Biplot. The Biplot is obtained by configuring markers for individuals and markers for variables in a reference system defined by the factorial axes resulting from the singular value decomposition (SVD).
JKBiplot returns a list containing the following components:
eigenvalues |
array_like; |
explvar |
array_like; |
loadings |
array_like; |
coord_ind |
array_like; |
coord_var |
array_like; |
Gabriel, K. R. (1971). The Biplot graphic display of matrices with applications to principal components analysis. Biometrika, 58(3), 453-467.
JKBiplot(mtcars)
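A minimal sketch of the JK construction, assuming the textbook factorization (row markers J = U D, column markers K = V) and column standardization for 'scale'. Object names are illustrative, and the package's returned coordinates may be scaled differently.

```r
X <- scale(as.matrix(mtcars))
dec <- svd(X)                      # X = U D V'

# JK factorization: row markers J = U D, column markers K = V
J <- dec$u %*% diag(dec$d)
K <- dec$v

# With all axes retained, Euclidean distances between individuals are
# preserved, which is why the JK Biplot favours the representation of rows
all.equal(as.vector(dist(J)), as.vector(dist(X)))

# Keeping two axes gives the rank-2 approximation shown in a planar biplot;
# its squared error equals the sum of the discarded squared singular values
X2 <- J[, 1:2] %*% t(K[, 1:2])
all.equal(sum((X - X2)^2), sum(dec$d[-(1:2)]^2))
```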
Plot_Biplot initializes a ggplot2-based visualization of the characteristics of the data analyzed by the selected Biplot.
Plot_Biplot(X, axis = c(1,2), hide = "none", labels = "auto", ind.shape = 19, ind.color = "red", ind.size = 2, ind.label = FALSE, ind.label.size = 4, var.color = "black", var.size = 0.5, var.label = TRUE, var.label.size = 4, var.label.angle = FALSE)
X |
List containing the output of one of the functions of the package. |
axis |
Vector of length 2 giving the axes plotted on the x and y axes. |
hide |
Vector specifying the elements to be hidden on the plot. Default value is “none”. Other allowed values are “ind” and “var”. |
labels |
Labels for the points. If "auto", the labels are the row names of the coordinates of individuals. Otherwise, a vector containing the labels. |
ind.shape |
Points shape. It can be a number to indicate the shape of all the points or a factor to indicate different shapes. |
ind.color |
Points colors. It can be a character indicating the color of all the points or a factor to use different colors. |
ind.size |
Size of points. |
ind.label |
Logical value. If TRUE, the name of each row of X is printed. If FALSE (default), the names are not printed. |
ind.label.size |
Numeric value indicating the size of the labels of points. |
var.color |
Character indicating the color of the arrows. |
var.size |
Size of arrow. |
var.label |
Logical value. If TRUE (default), the name of each column of X is printed. If FALSE, the names are not printed. |
var.label.size |
Numeric value indicating the size of the labels of variables. |
var.label.angle |
Logical value. If TRUE, the variable names are printed with the orientation of the angle of the vector. If FALSE (the default in the usage signature), the angle of all labels is 0. |
Returns a ggplot2 object.
hj.biplot <- HJBiplot(mtcars)
Plot_Biplot(hj.biplot, ind.label = TRUE)
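For readers who want to see what such a plot consists of, the following is a rough sketch built directly with ggplot2: points for individuals and arrows from the origin for variables, with visual settings mirroring the documented defaults. It assumes ggplot2 (version 3.4 or later, for `linewidth`) is installed and computes HJ-style markers by hand; the data frames `ind_df` and `var_df` are hypothetical names, and this is not the code of Plot_Biplot.

```r
library(ggplot2)

# Hand-computed markers on the first two axes (assumed HJ-style scaling)
X <- scale(as.matrix(mtcars))
dec <- svd(X)
ind_df <- as.data.frame(dec$u[, 1:2] %*% diag(dec$d[1:2]))
var_df <- as.data.frame(dec$v[, 1:2] %*% diag(dec$d[1:2]))
names(ind_df) <- c("dim1", "dim2")
names(var_df) <- c("dim1", "dim2")
var_df$label <- colnames(mtcars)

# Points for individuals, arrows and labels for variables
p <- ggplot() +
  geom_point(data = ind_df, aes(x = dim1, y = dim2),
             shape = 19, colour = "red", size = 2) +
  geom_segment(data = var_df,
               aes(x = 0, y = 0, xend = dim1, yend = dim2),
               arrow = arrow(length = unit(0.2, "cm")),
               colour = "black", linewidth = 0.5) +
  geom_text(data = var_df, aes(x = dim1, y = dim2, label = label),
            size = 4, vjust = -0.5) +
  labs(x = "Dimension 1", y = "Dimension 2")

# print(p) draws the plot in an interactive session
```

Because the result is an ordinary ggplot2 object, further layers or themes can be added with `+` in the usual way, which is also true of the object Plot_Biplot returns.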
Shiny UI for LDABiplots package
runLDABiplots(host = "127.0.0.1", port = NULL, launch.browser = TRUE)
host |
The IPv4 address that the application should listen on. Defaults to the shiny.host option, if set, or "127.0.0.1" if not. |
port |
The TCP port that the application should listen on. If the port is not specified and the shiny.port option is set (with options(shiny.port = XX)), then that port will be used. Otherwise, a random port is used. |
launch.browser |
If true, the system's default web browser will be launched automatically after the app is started. Defaults to true in interactive sessions only. The value of this parameter can also be a function to call with the application's URL. |
No return value
if(interactive()){ runLDABiplots() }