Publication
Some authors have recently argued that a finite-size scaling law for the text-length dependence of word-frequency distributions cannot be conceptually valid. Here we give solid quantitative evidence for the validity of this scaling law, using both careful statistical tests and analytical arguments based on the generalized central-limit theorem applied to the moments of the distribution (and obtaining a novel derivation of Heaps’ law as a by-product).
We also find that the picture of word-frequency distributions with power-law exponents that decrease with text length [X. Yan and P. Minnhagen, Physica A 444, 828 (2016)PHYADX0378-437110.1016/j.physa.2015.10.082] does not stand with rigorous statistical analysis. Instead, we show that the distributions are perfectly described by power-law tails with stable exponents, whose values are close to 2, in agreement with the classical Zipf’s law. Some misconceptions about scaling are also clarified.
Á, Corral, F. Font-Clos, Dependence of exponents on text length versus finite-size scaling for word-frequency distributions, Phys Rev E. 96(2-1) (2017) 022318
Related
Signup
Cookie | Duration | Description |
---|---|---|
cookielawinfo-checkbox-analytics | 1 year | Set by the GDPR Cookie Consent plugin, this cookie records the user consent for the cookies in the "Analytics" category. |
cookielawinfo-checkbox-functional | 1 year | The GDPR Cookie Consent plugin sets the cookie to record the user consent for the cookies in the category "Functional". |
cookielawinfo-checkbox-necessary | 1 year | Set by the GDPR Cookie Consent plugin, this cookie records the user consent for the cookies in the "Necessary" category. |
CookieLawInfoConsent | 1 year | CookieYes sets this cookie to record the default button state of the corresponding category and the status of CCPA. It works only in coordination with the primary cookie. |
PHPSESSID | session | This cookie is native to PHP applications. The cookie stores and identifies a user's unique session ID to manage user sessions on the website. The cookie is a session cookie and will be deleted when all the browser windows are closed. |
viewed_cookie_policy | 1 year | The GDPR Cookie Consent plugin sets the cookie to store whether or not the user has consented to use cookies. It does not store any personal data. |
Cookie | Duration | Description |
---|---|---|
mec_cart | 1 month | Provides functionality for our ticket shop |
VISITOR_INFO1_LIVE | 6 months | YouTube sets this cookie to measure bandwidth, determining whether the user gets the new or old player interface. |
VISITOR_PRIVACY_METADATA | 6 months | YouTube sets this cookie to store the user's cookie consent state for the current domain. |
YSC | session | Youtube sets this cookie to track the views of embedded videos on Youtube pages. |
yt-remote-connected-devices | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt-remote-device-id | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt.innertube::nextId | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
yt.innertube::requests | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
Cookie | Duration | Description |
---|---|---|
_ga | 1 year | Google Analytics sets this cookie to calculate visitor, session and campaign data and track site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognise unique visitors. |
_ga_* | 1 year | Google Analytics sets this cookie to store and count page views. |
_gat_gtag_UA_* | 1 min | Google Analytics sets this cookie to store a unique user ID. |
_gid | 1 day | Google Analytics sets this cookie to store information on how visitors use a website while also creating an analytics report of the website's performance. Some of the collected data includes the number of visitors, their source, and the pages they visit anonymously. |