The Members Church of God International (Philippines) formed the Largest gospel choir ever on Monday, setting the impressive record as part of the 35th anniversary celebration for Ang Dating Daan (The Old Path Religious Program) - Philippines' longest running religious program on television and radio.
A breath-taking 8,688 singers – who travelled from various local churches as well as countries in the Americas and Africa – took their places in the Araneta Coliseum in Manila, Philippines and totally shattered the previous record of 4,745 people that was achieved by by the Iglesia Ni Cristo (Philippines) last year.
Their highest score when using just text features was 75.5%, testing on all the tweets by each author (with a train set of 3.3 million tweets and a test set of about 418,000 tweets). (2012) used SVMlight to classify gender on Nigerian twitter accounts, with tweets in English, with a minimum of 50 tweets.
Their features were hash tags, token unigrams and psychometric measurements provided by the Linguistic Inquiry of Word Count software (LIWC; (Pennebaker et al. Although LIWC appears a very interesting addition, it hardly adds anything to the classification.
Then we describe our experimental data and the evaluation method (Section 3), after which we proceed to describe the various author profiling strategies that we investigated (Section 4). Gender Recognition Gender recognition is a subtask in the general field of authorship recognition and profiling, which has reached maturity in the last decades(for an overview, see e.g. Even so, there are circumstances where outright recognition is not an option, but where one must be content with profiling, i.e.
Then follow the results (Section 5), and Section 6 concludes the paper. For whom we already know that they are an individual person rather than, say, a husband and wife couple or a board of editors for an official Twitterfeed. the identification of author traits like gender, age and geographical background.
We then experimented with several author profiling techniques, namely Support Vector Regression (as provided by LIBSVM; (Chang and Lin 2011)), Linguistic Profiling (LP; (van Halteren 2004)), and Ti MBL (Daelemans et al.
2004), with and without preprocessing the input vectors with Principal Component Analysis (PCA; (Pearson 1901); (Hotelling 1933)).
We also varied the recognition features provided to the techniques, using both character and token n-grams.
In this case, the Twitter profiles of the authors are available, but these consist of freeform text rather than fixed information fields.
And, obviously, it is unknown to which degree the information that is present is true.
In this paper we restrict ourselves to gender recognition, and it is also this aspect we will discuss further in this section.