A critique and complement to “What gender gap in chess?”

The gap between women and men in chess has been a long-standing subject of debate. There have been many studies attempting to explain the seemingly large gap between men and women, both in participation and in strength at the top level. This old debate has reopened in recent weeks with an article in Mint attempting to explain the low presence of women at the top level, and a more recent Chessbase article titled “What gender gap in chess?” that confronted its main claims with a new statistical perspective. While this statistical framing is not entirely new and is largely a replication of the study of Bilalić et al. …
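For readers unfamiliar with the participation-rate argument that the Bilalić et al. study is built on, a minimal simulation can convey the intuition. This is only a sketch under illustrative assumptions (the pool sizes and the common rating distribution are made up), not the analysis from either article:

```python
# Sketch of the participation-rate argument (in the spirit of Bilalic et al.).
# Assumption: both groups draw ratings from the SAME distribution; only the
# pool sizes differ. All numbers are illustrative, not real FIDE data.
import numpy as np

rng = np.random.default_rng(0)
n_large, n_small = 16000, 1600   # hypothetical 10:1 participation ratio

gaps = []
for _ in range(2000):
    large_pool = rng.normal(1800, 250, n_large)
    small_pool = rng.normal(1800, 250, n_small)
    gaps.append(large_pool.max() - small_pool.max())

print(f"Average rating gap between the two top players: {np.mean(gaps):.0f}")
# Even with identical underlying distributions, the larger pool is expected
# to produce a higher-rated top player purely because of sample size.
```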

This year I’m teaching a module on Applied Machine Learning with over a hundred students. The students come from very different backgrounds and may not be well prepared in advance to successfully complete the module. To get all students up to speed and level the playing field, I have compiled a list of online resources on three areas that are key to becoming a successful Machine Learning practitioner: programming (Python), basic mathematics, and the use of the command line and virtual environments. In general, all these online courses and tutorials are suited for those interested in data science and machine learning. Moreover, most of these tutorials are short (i.e. …

Written by Jose Camacho Collados and Taher Pilehvar

Word embeddings are representations of words as low-dimensional vectors, learned by exploiting large text corpora. As explained in a previous post, word embeddings (e.g. Word2Vec [1], GloVe [2] or FastText [3]) have proved to be powerful carriers of prior knowledge that can be integrated into downstream Natural Language Processing (NLP) applications. However, despite their flexibility and success in capturing semantic properties of words, the effectiveness of word embeddings is generally hampered by an important limitation, known as the meaning conflation deficiency: the inability to discriminate among the different meanings of a word.
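A quick way to see both the strength and the limitation described above is to inspect the nearest neighbours that a pretrained word-level model assigns to an ambiguous word. The sketch below assumes the gensim library and a pretrained model in Word2Vec text format; the file name is hypothetical:

```python
# Sketch: inspecting the single vector a word-level model assigns to a word.
# Assumes gensim is installed and a pretrained file in Word2Vec text format
# is available locally (the file name below is hypothetical).
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("pretrained_embeddings.vec")

# An ambiguous word gets exactly one vector, so its nearest neighbours
# typically mix all of its senses in a single list.
for word, score in vectors.most_similar("mouse", topn=10):
    print(f"{word:15s} {score:.3f}")
```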

A word can have one meaning (monosemous) or multiple meanings (ambiguous). For instance, the noun mouse can have two different meanings depending on the context: an animal or a computer device. Hence, mouse is said to be ambiguous. According to the Principle of Economical Versatility of Words [4], frequent words tend to have more senses, which can cause practical problems in downstream tasks. Moreover, this meaning conflation has additional negative impacts on accurate semantic modeling; for example, semantically unrelated words that are similar to different senses of a word are pulled towards each other in the semantic space [5,6]. In our previous example, the two semantically unrelated words rat and screen are pulled towards each other in the semantic space because of their similarities to two different senses of mouse. This, in turn, contributes to the violation of the triangle inequality in Euclidean spaces [5,7]. …
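The rat/screen effect can be made concrete by comparing pairwise cosine distances between the single-vector representations, as in the sketch below (again an illustration, with the same assumptions and hypothetical file as before):

```python
# Sketch: why a single vector for 'mouse' drags 'rat' and 'screen' together.
# Same assumptions as above: gensim installed, hypothetical pretrained file.
from itertools import combinations
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("pretrained_embeddings.vec")

for w1, w2 in combinations(["rat", "mouse", "screen"], 2):
    dist = 1.0 - vectors.similarity(w1, w2)   # cosine distance
    print(f"d({w1}, {w2}) = {dist:.3f}")

# Because the single 'mouse' vector sits close to both senses, d(rat, mouse)
# and d(mouse, screen) are both small; a space that also kept d(rat, screen)
# large would violate the triangle inequality, so rat and screen end up being
# pulled towards each other instead.
```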

Neural networks have contributed to outstanding advances in fields such as computer vision [1,2] and speech recognition [3]. Lately, they have also started to be integrated into other challenging domains, such as Natural Language Processing (NLP). But how do neural networks contribute to the advancement of text-based applications? In this post I will try to explain, in a very simplified way, how to apply neural networks and integrate word embeddings in text-based applications, and some of the main implicit benefits of using neural networks and word embeddings in NLP.
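To fix ideas before going further, here is a minimal sketch (in PyTorch, with illustrative sizes and a made-up architecture, not the models discussed later in the post) of how word embeddings can be plugged in as the first layer of a neural text classifier:

```python
# Sketch: a minimal text classifier that integrates word embeddings as its
# first layer. Architecture and sizes are illustrative only.
import torch
import torch.nn as nn

class BagOfEmbeddingsClassifier(nn.Module):
    def __init__(self, vocab_size, embedding_dim, num_classes, pretrained=None):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        if pretrained is not None:
            # Initialise from pretrained word embeddings (e.g. Word2Vec/GloVe)
            self.embedding.weight.data.copy_(pretrained)
        self.classifier = nn.Linear(embedding_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, sequence_length) of word indices
        embedded = self.embedding(token_ids)   # (batch, seq, dim)
        pooled = embedded.mean(dim=1)          # average the word vectors
        return self.classifier(pooled)         # (batch, num_classes)

model = BagOfEmbeddingsClassifier(vocab_size=20000, embedding_dim=300, num_classes=2)
logits = model(torch.randint(0, 20000, (4, 12)))  # 4 dummy sentences of 12 tokens
print(logits.shape)  # torch.Size([4, 2])
```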

First, what are word embeddings? Word embeddings are (roughly) dense vector representations of wordforms in which similar words are expected to be close in the vector space. For example, in the figure below, all the big cats (i.e. cheetah, jaguar, panther, tiger and leopard) are very close in the vector space. Word embeddings represent one of the most successful applications of unsupervised learning, mainly due to their generalization power. The construction of these word embeddings varies, but in general a neural language model is trained on a large corpus and the output of the network is used to learn word vectors (e.g. Word2Vec [4]). For those interested in how word embeddings are built and in their current challenges, I would recommend a recent survey on this topic [5]. …
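As a concrete illustration of that construction step, the sketch below trains Word2Vec on a toy corpus with gensim. Real embeddings need corpora several orders of magnitude larger; the sentences and hyperparameters here are illustrative only:

```python
# Sketch: learning word vectors from a (toy) corpus with gensim's Word2Vec.
# Real embeddings are trained on billions of tokens; this only shows the API.
from gensim.models import Word2Vec

corpus = [
    ["the", "cheetah", "is", "a", "big", "cat"],
    ["the", "leopard", "is", "a", "big", "cat"],
    ["the", "tiger", "hunts", "at", "night"],
]

model = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1, epochs=50)

print(model.wv["tiger"][:5])             # first dimensions of the learned vector
print(model.wv.most_similar("cheetah"))  # neighbours in the (tiny) vector space
```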

As you probably know, DeepMind has recently published a paper on AlphaZero [1], a system that learns by itself and is able to master games such as chess and Shogi.

Before getting into details, let me introduce myself. I am a researcher in the broad field of Artificial Intelligence (AI), specialized in Natural Language Processing. I am also a chess International Master, currently the top-ranked player in South Korea, although practically inactive for the last few years due to my full-time research position. Given my background, I have tried to build a reasoned opinion on the subject, as constructive as I could make it. For obvious reasons, I have focused on chess, although some arguments are general and may be extrapolated to Shogi or Go as well. …

About

Jose Camacho Collados

Mathematician, AI/NLP researcher and chess International Master. http://www.josecamachocollados.com
