Showing posts with label vocabulary. Show all posts

Monday, March 3, 2014

The nature of the lexicon

So last semester when I was as busy as you can get, I wrote a paper. I think it's kind of interesting. The idea here is that I'm looking to do my thesis on the nature of word frequency in people's heads. This paper is a bit of foundation for the differences between L1 and L2 speakers and their mental lexicons. This paper will probably be the back third of the literature review section.


Saturday, June 22, 2013

New words for me

So I had to learn 100 new English words for the lexical acquisition class I took. Here they are, warts and all.

English
  1. polysemous – adj – describes a word that has several meanings (e.g. bank1, bank2)
  2. tumbleblog – n – a multimedia blog that isn't as text heavy as a more prototypical blog, a blog in the style of tumblr.
  3. fameball – n – "a derogatory term for someone who has an unquenchable desire for fame" from nytimes.com
  4. magnetar – n – neutron star with an extremely powerful magnetic field, the decay of which powers the emission of high-energy electromagnetic radiation, particularly X-rays and gamma rays
  5. social steganography – n – the practice of users of social media, often teens, to cloak messages with double meaning. (A status update of "Always look on the bright side of life" seems innocuous, unless you know the context of the song and that the author of the comment is making allusion to it. It allows for outsiders to see one reading and insiders to see another. Example from this paper.)
  6. bibelot – n – tchotchke, gewgaw, but with the overtones that explicitly French borrowings bring to English
  7. postern – n – a door or gate that isn't the primary door or gate
  8. proddie – n – a protestant, use is derogatory and Irish
  9. saccade – n – the jerking motion of the eye across the field of vision
  10. swag – n (possibly adj too) – style, Urban Dictionary has a lot of interesting info on this word, which in some communities may have positive connotations and negative connotations in others. An are-you-joking origin: an acronym for "Secretly We Are Gay", but UD suggests a much more plausible origin in Scots English. More likely origin: an abbreviation of swagger.
  11. incel – adj (also n) – involuntarily celibate
  12. teknonym – n – a name taken from a child (e.g. Mahmoud Abbas, chairman of the PLO, is also known as Abu Mazen, which refers to his first son Mazen)
  13. Einzelgänger – n – the opposite of a doppelgänger, unique
  14. ambilingual – adj – a fully fluent and balanced bilingual, the Platonic ideal of bilingualism
  15. arseward(s) – adj – perverse (obsolete), screwed up; I had assumed that back asswards was some sort of play on ass backwards, but no.
  16. intersubjective – adj – occurring between two minds (e.g. When two people use a word, it is intersubjective because they both know the word and agree on its meaning.)
  17. moil – v – to work hard; often collocated with toil – toil and m.
  18. progymnasmata – n – a curriculum of rhetorical development in antiquity
  19. phatic – adj – words that pertain to social relationships rather than their strict meaning
  20. decolletage – n – a low cut neckline in women's fashion
  21. poorism – n – tourism in poverty stricken areas (rhymes with tourism); probably a journalism word as I heard it on the radio (NPR, Worldview, 4 June 2013)
  22. binge watch – v – watching a whole season or series of movies in a very short time, possibly even one sitting: synonym for marathon, though I'm not sure what the exact difference might be
  23. crepitus – n – medical term for cracking knuckles (or any other joint really)
  24. survivorship – n – the state of being a survivor (thanks Wiktionary, I'd have never guessed that)
  25. stanchion – n – an upright bar or post that provides support to something else; the vertical bar in a leadlight
  26. leadlight – n – a decorative window made with small panes of glass separated by metal bars (though I'm not sure how this is different from stained glass except maybe there's no color in the glass of a leadlight?)
  27. raceway – n – an electrical conduit that is a decorative element rather than hidden behind a structure
  28. spiv – n – a criminal (specifically a con man or black marketeer) who dresses in a flashy manner; probably obsolete
  29. paresthesia – n – pins and needles feeling in a numbed or "asleep" body part
  30. doodlesack – n – bagpipes; possibly obsolete
  31. ergodic – adj – applies to dynamic systems whose average behavior is the same over time when compared to average phase states; applies to Markov chains and thermodynamics
  32. comedo – n – a blackhead
  33. be spoiled – v – to have spoilers told to you
  34. bogie – n – the wheel unit of a train, typically with two axles
  35. journal – n – the part of the axle that lies on bearings
  36. dinkum – n – hard work (regional to Derbyshire)
  37. dinkum – adj – good, excellent, honest (regionalish to Australia)
  38. ology – n – a science; a backformation, its use probably indicates low social status on the speaker's part
  39. pinion – n – a type of gear: either the one inside another gear that has cogs on its inside, or the one that meshes with the rack
  40. rack – n – a bar with cogs, as if it were a gear built flat instead of round
  41. threequel – n – a second sequel, probably in distinction to part three of a trilogy 
  42. bargainous – adj – an outstanding deal, opposite of spendy
  43. dee – n – police detectives
  44. slashdot – v – to overwhelm with messages
  45. half handle – n – one of the two pieces of a knife handle that are not the metal part of the knife
  46. tang – n – the metal part of the knife between the two half handles
  47. bolster – n – the thickened metal part of a knife just past the handle
  48. heel – n – the portion of a knife's blade that extends below the handle
  49. guard – n – the taper on the metal between the blade and bolster
  50. back – n – the edge opposite the cutting edge of the knife
  51. skimmer – n – a kitchen utensil with a handle and a round, perforated dish on the end that is used to take food out of liquid or skim things off the top of soups
  52. slot – n – the space between the tines of a fork
  53. demitasse – n – a small cup for coffee, usually used for espresso
  54. ramekin – n – an oven to table dish that is used for individual portions
  55. cork ball – n – the inner cork portion of a baseball
  56. yarn ball – n – the outer wound portion of a baseball
  57. cover – n – the leather outside of a baseball
  58. shrift – n – the act of confession, related to shrive
  59. emulous – adj – ambitious, though possibly without the negative connotations (hard to say what connotations ambitious had when RL Stevenson was writing)
  60. coquetry – n – effort to attract attention, often directed from a woman to a man
  61. risk – v – to take a risk (heard in a live interview on the radio, is this a nonce form caused by the ease of zero derivation in English?)
  62. chuff – v – to make a noisy puffing sound like a steam engine (heard every day thanks to Thomas and Friends, defined thanks to class) Source: oh, the horror, http://ttte.wikia.com/wiki/Roll_Along
  63. chuff – v – to deliberately and obviously fail a standardized test
  64. chuffed – adj – pleased
  65. shunt – v – to move a train from one track to another or move cars from one train to another (again courtesy of Thomas and Friends)
  66. head – n – the flared load-bearing part of a railway rail
  67. web – n – the thin(ner) part of a rail between the base and head
  68. fishplate – n – the piece that joins two railway rails together
  69. safety line – n – the textured and colored strip at the edge of a train station platform
  70. trough – n – long, narrow area of low pressure, named because of its shape
  71. cornice – n – portion of roof that overhangs the main structure for rain protection
  72. pilaster – n – rectangular column that sticks out of a wall but is structurally insignificant
  73. fore edge – n – the edge of a book opposite the binding
  74. square – n – part of book board that overhangs the block; a book cornice
  75. action – n – the part of a piano that transfers motion between the key and hammer; possibly generalized to (musical) keyboard function
  76. nacelle – n – /ˈnæsl̩/ at M-W, /nəˈsɛl/ at Wiktionary – the boxy part of a windmill between the hub and the mast (and just what is the difference between a mast and stanchion?)
  77. swart – adj – dark; related to swarthy
  78. percy – adj – personal (of a drug stash), which has led to a meaning of unreal or legit
  79. bpw;dr – initialism – behind pay wall; didn't read
  80. avidity – n – greed, intensity of desire
  81. catholicity – n – universality; appears to relate to catholic and Catholic
  82. cheval-glass – n – a full length mirror on a stand that allows the mirror to pivot
  83. arras – n – tapestry, since Arras (a city in France) was a major source of them
  84. tippet – n – a shoulder covering garment, an animal skin draped on the shoulders is a typical example, a scarf worn loose around the neck is another example
  85. bartizan – n – (also bartisan) a wall with projecting battlements like the Spanish fortifications in Cádiz or San Juan (though the word is Scottish in origin)
  86. retronym – n – a term modified to make the original sense explicit as in acoustic guitar, natural turf or white milk; apparently coined by Frank Mankiewicz
  87. annuitant – n – someone who gets an annuity
  88. absquatulate – v – to run off; a joke coinage from the 19th century
  89. hangry – adj – anger caused by hunger; a joke portmanteau of the 21st century (not the only portmanteau of hungry I've seen, just that this one was new)
  90. nyctophilia – n – a love of the night
  91. cat vacuum – v – to be doing something other than the writing you ought to be doing (like collecting words rather than preparing a book I ought to be)
  92. adiabatic – adj – involving no heat transfer into or out of the working fluid of thermodynamic processes; highly associated with process
  93. hegemony – n – a notion of a power structure involving domination of one group over another; more narrowly a power structure that is not questioned: how Things Are The Way They Are and why they stay that way
  94. memristor – n – passive electrical element (resistor, capacitor and inductor are the other three) caused by imperfect metal-metal contact; portmanteau of memory resistor
  95. coherer – n – radio detector that predates crystal detector; some are memristors
  96. articulated bus – n – one of the really long buses with two halves
  97. slug – v – to carpool informally, though not as informally as hitchhiking, usually to take advantage of HOV lanes (similar to the expresses on the Kennedy, but for cars with multiple passengers) or lower tolls
  98. slug – n – the person who gets a slugged ride; not the driver
  99. tender – n – another name for coal car
  100. tank engine – n – a steam engine that does not have a tender, but carries all fuel and water on-board

Wednesday, June 19, 2013

L2 tactics on L1 vocabulary learning

Since one of the projects in the vocabulary acquisition class is to learn 100 new English words, I figured I could take a page out of L2 teaching. There are some ways of teaching that include pre-reading vocabulary instruction. Why not do the same for L1? Sure, I've got an adult-sized vocabulary, but why make this harder than necessary?

But where to find the words I don't know? Enter the Simple Concordance Program teamed up with Jekyll and Hyde. One of the SCP's tricks is that it can generate a word list by frequency of the word's use. And Jekyll and Hyde is in the public domain, so its text is already txt. So team them up to make a list showing from least frequent (one use for a whole bunch) to most frequent (1,600 uses for the). Scan the words that occur once for likely targets, and you've now got a list of words to learn. 

Just. Like. That.

(I should add that this doesn't add to vocabulary depth or catch all of the likely targets, but it speeds things up quite a bit.)
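The same trick can be sketched in a few lines of Python instead of the SCP. This is only a minimal sketch under assumptions: the function names are made up for illustration, the sample sentence is invented rather than taken from Jekyll and Hyde, and the tokenizer is deliberately crude (lowercase letters plus an optional apostrophe).

```python
import re
from collections import Counter


def frequency_list(text):
    """Return (word, count) pairs sorted from least to most frequent."""
    # Crude tokenizer (an assumption): runs of letters, optionally with one apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(words)
    # Sort by count, then alphabetically so the order is stable.
    return sorted(counts.items(), key=lambda pair: (pair[1], pair[0]))


def hapaxes(text):
    """Return the words that occur exactly once: the likely learning targets."""
    return [word for word, count in frequency_list(text) if count == 1]


# Invented sample text; swap in the contents of a plain-text book file.
sample = "the door was shut and the lamp was lit and the street was empty"
print(hapaxes(sample))
# ['door', 'empty', 'lamp', 'lit', 'shut', 'street']
```

On a whole book the once-only list will be long, so it still needs a human scan for words that are actually unfamiliar.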

Tuesday, June 11, 2013

Word frequency lists

One of the things a Latin nerd is up against is poor word frequency lists. In English, there are lots of good lists. The frequency data from COCA is top notch.

If you look at their big list, you'll see that somehow they manage to deal with inflected forms: be, which is English's most inflected word, is in second place. Properly so. But then you go to a pre-cooked Latin frequency list and the various forms of esse are scattered. And to a degree I understand. It's easier to build a frequency list that ignores these sorts of things. Quo could belong to quo or qui/quae/quod or quis/quid. I suppose there are ways around it, but then you start getting into having to program a computer to know the difference. I don't want to think about teaching a computer the difference between cum1 and cum2. But to some degree that's small potatoes.
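To make the scattering problem concrete, here is a minimal sketch of counting by lemma rather than by word form. Everything in it is an assumption for illustration: the lookup table is hand-made and tiny, the function name is hypothetical, and a flat table like this cannot resolve ambiguous forms such as quo or cum, which is exactly the hard part.

```python
from collections import Counter

# Hypothetical hand-made table: a few forms of esse mapped to one headword.
# A real lemmatizer needs context to handle ambiguous forms like quo or cum.
FORM_TO_LEMMA = {
    "sum": "sum",
    "est": "sum",
    "sunt": "sum",
    "erat": "sum",
    "esse": "sum",
    "fuit": "sum",
}


def lemma_counts(tokens, table=FORM_TO_LEMMA):
    """Count tokens by lemma; forms missing from the table count as themselves."""
    return Counter(table.get(token, token) for token in tokens)


# Invented token list, not a quotation from any Latin text.
tokens = "est erat sunt puella est".split()
counts = lemma_counts(tokens)
print(counts.most_common())
# [('sum', 4), ('puella', 1)]
```

Counted by form, est would score 2 and erat and sunt 1 each; counted by lemma, they pool into a single entry of 4.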

Perseus has a word frequency tool buried in the results page of the word study tool, and it's pretty cool. But as a frequency analysis for fax shows, it's got an idiosyncratic approach to defining the corpus (i.e. de senectute is its own corpus and so is epistulae ad familiares and so on and so on). So at Perseus you get an idea of frequency, so long as you're not interested in a broader vision of Latinity. Other lists give you an absolute ranking and no more. Some give you the lemma; others give you the assorted word forms. And then there's a super list that I love (it's true) from Dickinson College Commentaries.

In any case, I've not found one that's good at tracking down collocations—its own can of worms. Oh, woe to someone whose interest in Latin goes beyond the literary, historical or pedagogical. 

Sunday, June 2, 2013

L2 vocab project: Persian

I've settled on Persian for my non-English language for the vocabulary acquisition class. It's a language I'm fascinated by for a host of reasons. It's Indo-European and has a very long history—and I'm a sucker for history like that. And while modern Persian isn't its ancient counterpart, I'm also fascinated with the interplay between Iran and the West, which has been going on for years. And I'd like to go there, because it looks way cool. Alas, money is the biggest obstacle. Though the state based in Washington doesn't think it should be easy for me to go there either (and I'm not entirely sure that the state based in Tehran feels much differently).

So I'll get a little vicarious. As usual, I'll be blogging my way through this mess. In order to keep things from getting off track here, I'm setting up a blog here. To help anchor the words in the culture of Iran, I'm going to pick out proverbs. For one, you can't separate a language from its culture. For another, one of the aspects of a language that has been hinted at in class is set blocks of speech. And I really hope to be able to tell you more about formulaic speech in upcoming entries.

Tuesday, May 28, 2013

Building a crane to the vocabulary spurt

In the third chapter of Becoming a Word Learner, Linda Smith talks about how children build a crane (her term) to make themselves better word learners.

The notion is that when children learn their first words, they start to notice patterns about those words and create templates for future learning. Her main point is that children seem to fix on shape, rather than some other property, to signal an object's class. For example, chairs—prototypically anyway—have four legs, a seat and a back. This shape cues children in that a CHAIR is a chair. What's interesting is that when researchers cue very young children in on the shape bias by training them, their vocabularies grow faster.

So whatever the exact mechanism may be, children are learning how to learn words by—and this is truly shocking—learning words. Once they get to a certain point, the biases and patterns they've developed seem to take on a life of their own.

How might this relate to learning a second language? I'm not wholly sure, but allow me some speculation. One of the things that foreign language learning materials seem to focus on is inflectional morphology, which makes enough sense. You can't speak the language if you don't know how speakers expect things to be ordered. Latin wants case inflection on nouns. English wants word order. Spanish wants you to be clear about which object you are talking about via definite and indefinite articles. Russian couldn't care. And so on.

But one thing that foreign language materials, so far as I've seen anyway, don't worry too much about is derivational morphology. How do you get from civil to civility? And why can't you go from polite to politity? I'll be reading a paper on exactly this subject, and thus blogging about it later. I could be wrong, but I suspect that adult learners are given vocabulary lists that they then create a derivational morphology from. Or at least that's how it felt to me when I was learning Latin all those years ago. Civis became civitas. Aestus became aestas. Could moralis become mortalitas? And the connection is made, though not without flaws. I think it was then that my grasp on Latin vocabulary started to really firm up from a list of words to memorize to things that behaved in similar ways. In other words, I had made a derivational morphology crane for myself.

Thursday, May 23, 2013

Icon, check. Index, check. Symbol, um.

So we're reading Becoming a Word Learner for class. And so far it's striking me as an extension of First Language Acquisition—everyone's got a different model.

The first chapter is "Word Learning: Icon, Index or Symbol?", which seems like a good place to start the discussion about learning words. After all, you need to show what it is that people are doing when they learn words. The way the authors, Golinkoff and Hirsh-Pasek, do that is by looking at attempts to teach non-humans human languages. (They say infrahuman, but I don't like the term. It smacks of chain of being, which I detest.) And to get to a human understanding of language, you need to be in possession of what they call symbol. Animals can, for the most part, only manage icon and index. But the part that bothers me is that I can't quite draw the distinction between index and symbol.

[Image: An icon of fire]
Icon seems pretty straightforward. An icon is a representation of the thing itself. Here's an icon of fire.

See? It's not a fire, but it looks like fire. There seems to be some dispute as to just how much resemblance is necessary, but I'm going to ignore that.

[Image: An index of fire]
The next remove is index. An index is either something that is correlated with or points to something. So for fire, smoke is an index. Other indexes of fire might be: heat, wood, camping, cooking, matches. So here's a picture of smoke, which is an index of fire.
[Image: One symbol for fire (according to Wiktionary)]
The problem comes in with symbol. At this page (a somewhat less in-depth discussion than in G and H-P), symbols are "easily removed from context" and "associated with large sets of other words". Ok, so far so good. I can talk about fire with none being present, as well as knowing that it has an association with other words like smoke, heat, wood, camping, cooking, and matches.

Here's the problem, which strikes me as a father of young children. We talk a lot about the here and now at home, which means that we are talking about things that are not removed from context, particularly with my son (2;5). My daughter has made the leap to things that aren't present, e.g. her upcoming birthday party. So we're kind of defeating the benefit of a symbol. In fact, we're treating words like indexes. We don't say MILK unless there is milk somewhere nearby, or we are trying to get the milk from the fridge into a cup or something very concrete. The other thing is that while we are indexing MILK to milk, we are also indexing it to such things as cups, lunch, cold, cereal, spoons, fridge and the like. So we're somewhat taking advantage of the association with other words, but they too are indexed in the here and now.

Anyway. What I'm trying to get at is that I'm not seeing a clear line between index and symbol. Maybe at the ends of the index/symbol spectrum of goodness, it's clear. (Oooo, could it be a spectrum relationship?) Maybe as a child's ability to use language apart from the here and now develops, the child develops cognitive ability to make symbols out of indexes.

But there seems to be a lot of messy could-go-this-way or could-go-that-way and begging the question involved with indexes and symbols. If words are symbols, why are they so indexy early on? If words start their mental existence as indexes, then what transforms them into symbols? Do we even need to draw a distinction between index and symbol other than to say that a symbol is an index plus displacement? Or is this just another way that we're trying to separate man from beast without pointing at the actual neurological difference between what humans and, say, language-trained chimps are doing? I don't know. As usual here, more questions than answers. I absolutely promise interesting tools for learning new words in a second language before the end of summer. Cross my heart.

If you're curious, you can browse the book here. Why not? It's pretty interesting so far.

Sunday, May 19, 2013

Lexical acquisition

I start my first (and hopefully only) summer course tomorrow. Lexical acquisition. One of the projects will be a snap. Learn new words in a second language. I doubt that's all there will be, but it seems pretty easy. Find a clutch of words I don't know. If I'm feeling ambitious, which I haven't been lately, I'll pick Persian. If I'm not, I'll pick Greek.

The problem is that the other side of the project is finding new English words. The problem, so far as I can see it before the class starts, is that I can't tell which of the words I hear are ones I don't know. The context of stuff is usually pretty obvious. In reading, I'll have context, but I'll also have that "hm, I don't think I know that one" sensation. The other problem is that if anyone is using unusual words in my life, it's me.

I hope I can blog a lot about the stuff I learn in this class, because it sounds fascinating to me. Thus it will get inflicted on you, dear readers.

Wednesday, January 2, 2013

Dickinson College Commentaries

Ok, if you have a sick love of Latin the way I do, you need to know about the Dickinson College Commentaries. Especially their vocabulary list. Especially that. To that end, I've made a spreadsheet version available for download and thus offline access.

Since discovering it, I've made good use of it in student materials. It also provides a manageable list for (high-school level) students to master over their two years of introductory courses. The DCC blog says this about the list:
The Latin list contains about 1000 of the most common words in Latin. These are the lemmas or dictionary headwords that generate approximately 70% of the word forms in a typical Latin text.
Mind you, 70% is not enough to get fluent reading going on, but it's a good start. I've seen a video that shows that 95% coverage is needed for a student to guess at unknown words. So the DCC list, in conjunction with same page vocabulary support, is a good starting point for students to build their vocabulary.
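What "coverage" means here can be shown with a small calculation: the share of running words in a text that appear on a known-word list. The sketch below is a minimal illustration under assumptions. The function name, the sample sentence and the known-word set are all invented, and it matches exact word forms rather than lemmas, which a real check against the DCC list would need to do.

```python
def coverage(tokens, known):
    """Return the fraction of running words (tokens) that are in the known set."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in known)
    return hits / len(tokens)


# Invented example: 10 running words, 7 of which are on the known list.
tokens = "the dog and the cat saw the bibelot absquatulate quickly".split()
known = {"the", "dog", "and", "cat", "saw"}

print(f"{coverage(tokens, known):.0%}")
# 70%
```

Note that the count is over running words, so a frequent word like "the" counts every time it appears. That is why a short list of common words can cover so much of a text.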

Given that I've written a three-year curriculum for the younger students at the school I work at, I should probably give a look-see at my vocabulary list and see how it matches up. My gut feeling is that in some ways it matches up pretty well, but in others it doesn't. 

Tuesday, November 22, 2011

2 Greek 2 Quit

Thank you Prof. Major. I didn't know these documents existed, but now I do.


The 50% list is really short. English needs just over 100 lemmata to hit the 50% of text mark. Greek needs 65 (according to Major, and I have no reason to doubt him). Greek hauls in at about 1,200 words for the 80% list. English doesn't get to 80% until about 2,000. No matter how you slice that, Ancient Greek is easier than English on the vocabulary front. I had suspected this, but wasn't sure. 
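The comparison rests on a simple cumulative count: sort the lemmas by frequency and see how many you need before the running total reaches a given share of the text. A minimal sketch of that calculation follows. The function name and the toy counts are made up for illustration; they are not Major's data or real English figures.

```python
def lemmas_needed(counts, target):
    """Return how many of the most frequent lemmas reach the target share of tokens."""
    total = sum(counts)
    running = 0
    # Walk the counts from most to least frequent, accumulating coverage.
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        if running >= target * total:
            return rank
    return len(counts)


# Toy frequency counts for eight lemmas (100 tokens in all); purely invented.
toy_counts = [40, 20, 15, 10, 5, 5, 3, 2]

print(lemmas_needed(toy_counts, 0.50))  # 2
print(lemmas_needed(toy_counts, 0.80))  # 4
```

Run over real lemma counts for two languages, the same function would give figures of the kind compared above: lemmata needed for 50% and for 80%.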

Not that 80% is all that hot: you need to be at about 95% coverage in a text to be able to guess successfully what the unknown word might mean. This video is an excellent demonstration. Jump to 19:00 or so for the sickening demonstration he performs. The 90% coverage paragraph has words in it that I can't guess, and I'm a native English speaker. It's really shocking.

Next: Major provides a paper on pedagogy. He seems to be saying that a lot of what we do in teaching Ancient Greek is colored by two things. First, we expect that we can go from zero to grad school in about two semesters. Second, Latin's idiosyncrasies color how we teach Greek. Ut triggers subjunctive and the vast mess that subjunctive involves in Latin, so ἵνα must merit the same attention. Right? Major says not so much. He seems to be on the verge of making some general rules for learning Ancient Greek, but stops short. Too bad. Even so, it provides some context for what I found over the summer.