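Date formatting in Go

Go's date formatting is unusual (some would say bizarre), but it is also quite clever. A layout string is written using one fixed reference time, in which every field has a distinct value:

  • 1 month
  • 2 day
  • 3 hour (written 15 in 24-hour layouts)
  • 4 minute
  • 5 second
  • 6 year (written 2006 in full)
  • 7 time zone offset

You format a time with d.Format("2006-01-02 15:04:05Z07"), where d is a time.Time. A colleague of mine once wrote d.Format("2006-01-02 15:04:02"). Can you spot the mistake? He typed 02 where 05 belongs, and since 02 stands for the day of the month, the seconds field always showed the day instead. With an API like this, all you can do is shake your head.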

How to highlight code in Keynote

To paste highlighted code into Keynote, you have to paste RTF text. On a Mac there are two ways to copy highlighted code as RTF.

Method 1: highlight

brew install highlight
pbpaste | highlight -O rtf -S js | pbcopy
cat xxx.js | highlight -O rtf -S js | pbcopy

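If you know Automator, you can create a service that does this automatically. In the screenshot below, I select some text, open the Services menu, and choose the highlight-rtf service I wrote; the selected text is then highlighted.

[Screenshot: Highlight Service]

If enough people are interested in this (that is, if the post gets shared a lot), I will post detailed steps for building the service along with the script I wrote.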

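Method 2: the Vim plugin CopyRTF

Sometimes highlight does not support your language. In that case you can use a Vim plugin instead: if you write code in Vim, you almost certainly already have syntax highlighting for that language.

Install the Vim plugin CopyRTF. Then running :CopyRTF copies the current file, or the selected text, to the clipboard.

I also recommend the PaperColor color scheme, which lets you customize the background color.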

let g:PaperColor_Dark_Override = { 'background' : '#000000'}
let g:PaperColor_Light_Override = { 'background' : '#ffffff'}

Notes on The Content Marketing Handbook

The Content Marketing Handbook Summary Notes

For questions please email: info@priceonomics.com

Chapter I: Getting Started

  1. Don’t “play-act” and emulate what you think content marketers are supposed to do.
  2. Your content needs to be great, because it’s competing with the best media on the Internet.
  3. Your focus should be to create a process that produces a favorable outcome when people at your company expend effort at making content.
  4. Find the overlapping area between great content and content that helps your company.
  5. Information is the single best thing to write about.
  6. Information = data, industry, people.
  7. This Handbook is about the process of creating and spreading great content.

Chapter II: Information Marketing

  1. Public relations (PR) is a negotiation between you and a journalist. You are asking for his or her time; what are you offering?
  2. Help journalists do their job, which is to create interesting articles that reach many people.
  3. Providing information that is unique, authentic, and interesting is currency.
  4. Information promotes your company and helps journalists write great things.
  5. OKCupid made company data interesting in the modern era and proved that it can really spread.
  6. Time and time again, data-driven reports by companies have spread like wildfire.
  7. To do this job, you have to believe that every company has interesting information.
  8. Beyond data, strive to understand your industry and market. This information is valuable and interesting to some people.
  9. Everyone has an interesting story to tell.
  10. Everything can be interesting, even office supplies.

Chapter III: The World Is Flat (The Level Playing Field)

  1. Increasingly, it does not matter where good content is published. Content can be successful anywhere, whether it’s on The New York Times’ website, Medium, or your company’s blog.
  2. Every outlet, even The New York Times, gets its traffic from third party platforms like Reddit, Twitter, and Facebook that control most of the audience.
  3. Because of these platforms, any site can make an article go viral, break news, or be an instant authority if it knows what it is doing.
  4. Not only do you have to create excellent content, but you need to think about where your content will spread.
  5. Content channels (journalists, social news sites, etc.) are “supernodes” for reaching larger audiences.
  6. Find the “supernode” site where your content fits in best and genuinely participate in the community to find out what’s popular.
  7. As you write content, think about which channels it will be popular on. If you can’t think of one, it probably won’t be popular.
  8. These supernode sites give your content “The Bump”: a burst of traffic that allows your content to reach its intended audience.
  9. Eventually, if you keep at it, this outbound process of spreading content will become easier and people will share it on their own. Then it becomes an inbound process.
  10. Even if you write something great, there is a fine line between success and failure. You have to wage a campaign and exert effort to make something spread.

Chapter IV: Social Networks

  1. Facebook sends a lot of traffic, but typically only after your content has gotten “The Bump” from a supernode.
  2. After “The Bump,” most of your traffic will come from Facebook. When you hit a homerun, Facebook really piles on the traffic.
  3. Facebook will broadcast anything if a sufficient number of people share it, which is great for small blogs that traditional broadcasters ignore. Ask yourself, “Why do people share?”
  4. People share as a form of self-expression. Ask yourself, “What does the act of sharing this content say about me?”
  5. Think about how someone will feel when they share your content. Is it a feeling they want? Is it a feeling you want your brand associated with?
  6. Is your content digestible? Is it easily condensed into a Tweet or a Facebook status?
  7. People don’t share something just because it’s good. They need something specific to say about it.
  8. You should organize your data in a way that gives people something to say about it.
  9. Facebook = Traffic. Twitter = Where journalists hang out.
  10. Clear writing is good writing and the most shareable writing. So write clearly.

Chapter V: The Writer’s Playbook

  1. Your audience is distracted. They are supposed to be working; instead, they are reading content on the internet.
  2. Your introduction is a pitch to convince the reader to spend his or her time on your article. It has to be great.
  3. Your conclusion is where you tell the reader what the takeaway is. What should he or she share?
  4. The article should hold together based solely off its topic sentences.
  5. Being right is important. If you write a popular essay, thousands of people will try to prove you wrong. In a similar vein, phrasing that can be interpreted to make you look bad will be interpreted to make you look bad.
  6. Focus on a narrowly-scoped argument and just try to prove one thing. That’s how you make sure you’re right.
  7. All first drafts are terrible. Have someone edit your work aggressively and scrutinize every fact and argument in it.
  8. Titles matter. They are what show up on social news sites and social networks.
  9. People who share your article are also sharing your title. Will the title make them look good?
  10. Find a common voice and perspective for your company’s written content. What’s your equivalent of “Priceonomics is my nerdy friend who tells me what’s up?”

Chapter VI: Writing Hacks

  1. Commit to writing one great piece of content based on your company’s data, then do everything possible to make it successful.
  2. If you taste success at writing content, you’ll want more success.
  3. Target 40 hours per blog post in the beginning.
  4. Once you get the hang of making great content, you’ll need to publish more frequently and will often struggle to come up with good ideas.
  5. The three kinds of information we suggest writing about are data, industries, and people.
  6. Present data in the way that tells a story and makes it spread.
  7. Write about industry knowledge your company has (or can research) that is valuable to other people.
  8. Everyone is the hero of his or her own world. When you write about people (entrepreneurs, inventors, CEOs, customers, etc.), follow the Hero Cycle.

Chapter VII: Hiring

  1. How hard do the people who run your company’s blog work to make great content?
  2. You need to create a system in which talented people can work hard and be rewarded by having their work matter and reach the intended audience.
  3. When screening applications, ask for a resume, writing sample, and list of ideas. The person with the best list of ideas will be the best candidate.
  4. If you like a candidate, pay them to do a freelance piece for you. If you are impressed by what she writes and you enjoy working with her, hire her.
  5. It takes two months of effort for a writer to hit his stride. Nothing works perfectly the first time, so be patient.

Chapter VIII: Why Content Matters

  1. Great content pays off financially.
  2. Content that advertises your business has to be better than content made by professional media companies.
  3. If your content has a mix of information that is great but not promotional for your business, that will help build your audience over time.
  4. Start by making sure you can produce great information that both spreads and promotes your business in some way. If you can pull that off, then slowly add non-promotional content to the mix. This is an advanced maneuver.
  5. Don’t write about topics you are obviously biased about; the internet will crucify you.
  6. You should take pride in your content and consider it to be in competition with that of professional media companies.
  7. Content is free stuff that is bundled with something that costs money. Media companies bundle it with icky ads; you should bundle it with a product you believe in.
  8. Write about information. Make it great. Have a plan for making it spread.


One Page Checklist

Checklist before writing:

  1. Am I writing about data that is proprietary to my company? If not, am I discussing an industry or a person that I have information about because of my company’s knowledge?
  2. Who are 50 journalists I can contact about this piece?
  3. What will my pitch to journalists say?
  4. What channel (“supernode”) will this piece spread on? A subreddit? Hacker News? Digg? A particular forum?
  5. What would someone say about it in Tweet form? Why would someone share it, and what specifically would they say?

Checklist during writing:

  1. Does the introduction/hook compel the reader to read the rest of the article?
  2. Does the conclusion tell the reader the one takeaway they should get from the article and prime him or her to share it?
  3. Can you read just the topic sentences of the body of the article and still understand it?
  4. Could any phrases or wording possibly be misinterpreted to make you look bad, rude, arrogant, idiotic, and/or offensive? If there is a remote possibility of this, change it.
  5. Do you make any statements without proving them, or are there any gaps in your logic? Fix them.
  6. Is everything 100% correct and verifiable? Would you stake your job on it?
  7. Is the information you produced something that journalists will find valuable and people will find shareable? Then you have a good piece of content.

Checklist after writing:

  1. Email the 50 journalists you think would be interested in your post.
  2. Submit it to the forum where you think it might be popular (if appropriate, and if you are a member of that community).
  3. Keep trying until the article starts spreading. All it takes is one “Yes” from a journalist, or one social news site to feature your work.
  4. You have to wage a campaign in the military sense of the word: a series of operations intended to achieve a particular outcome. If you do not put in this effort after publishing, nothing will happen.


Want Priceonomics to Help you with Content Marketing?

We are obsessed with making great content and spreading information. Your blog and content should be generating tons of traffic, press, and leads for you. There is no reason not to.

Option 1: We’ll help you come up with good ideas and then edit them for you so they’re good
We help you come up with 4 good ideas per month and provide editorial assistance on making them great and spreadable. We’ll make sure you start with good ideas, your content is well made, and designed to spread. This is sort of like having an editor at Priceonomics as a part-time member of your team. Three month minimum commitment and pricing starts at $2K per month.

Option 2: We’ll hit a homerun content marketing campaign for you
We create incredible content for you based on your company’s data and information, at a less expensive price than if you produced it yourself. The content lives on your company’s site and you pay for performance. $2K per campaign. This is basically at cost. Additional $3K “success fee” if certain metrics are hit (5 PR mentions of piece or 5000 views or 100 Facebook likes). We only turn a profit on helping you create great content if you succeed.

Interested? email info@priceonomics.com


Note: To participate in either of these programs you must use our Priceonomics Content Tracker software to ensure this is a rigorous process. The software includes:
● Track inbound links, visits, and social sharing for your content
● Track the performance of your past content to set a baseline and goals
● Optional integration with Slack to notify you when content hits milestones

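Ranking algorithms

Idea: use machine learning to build a ranking bot for operations that ranks and recommends content. Candidate signals: time decay (t or t^2), user ratings, recommendations from VIP users, recommendations from expert users (weighted ratings, with a different weight for each user), word segmentation, topics, information entropy estimated by machine learning, and editors' picks.
The criteria for evaluating the algorithm are simply the operational metrics: bookmark rate, like rate, rating rate, share rate, time spent reading, and number of opens.

Reference article:
http://www.cricode.com/2374.html

Which algorithm does 极客头条 (Geek Headlines) use for its ranking?

The answer: ranking based on user votes. This kind of algorithm is widely used in data mining.

There are several vote-based ranking algorithms. Below we go through them one by one and finally suggest which one Geek Headlines may be using.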

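Ranking algorithm 1: the Delicious algorithm

The Delicious algorithm is the simplest and most intuitive vote-based ranking: it ranks items by "the number of times bookmarked in the past 60 minutes" (the counterpart of the upvote button on Geek Headlines) and recomputes the counts every 60 minutes.

Its advantages are that it is simple, easy to deploy, and refreshes content quickly. Its drawback is that rankings do not change smoothly: an item near the top in one hour often plunges in the next.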

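Ranking algorithm 2: the Hacker News algorithm

The Hacker News algorithm was designed and implemented by Paul Graham (the well-known author of Hackers & Painters). It ranks posts by computing a score for each one.

The formula is:

`Score = (P - 1) / (T + 2)^G`

where
P is the number of votes the post has received; 1 is subtracted to ignore the submitter's own vote.
T is the time since submission, in hours; 2 is added so that brand-new posts do not make the denominator too small (the value 2 may have been chosen because it takes about two hours on average for an article to go from its original site to Hacker News).
G is the "gravity" factor, the force that pulls a post down the ranking; its default value is 1.8.
So three factors determine a post's rank: the vote count P, the time since submission T, and the gravity factor G. G can be tuned in an implementation until a good empirical value is found, which keeps the ranking accurate.

A characteristic of the Hacker News algorithm is that users can only vote up, not down. It is therefore not suitable for sites that need both positive and negative opinions.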

Doesn't this sound a bit like how Geek Headlines works? Hold on and keep reading.

Details: the Hacker News algorithm

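Ranking algorithm 3: the Reddit algorithm

As the name suggests, this algorithm comes from Reddit, the American social news site. It is more complex than the Hacker News algorithm and supports both upvotes and downvotes. It takes the following factors into account:

1) The age of the post, t.
2) The difference between upvotes and downvotes, x.
3) The vote direction, y: a sign variable for the overall opinion of the post. y is +1 if upvotes are in the majority, -1 if downvotes are in the majority, and 0 if the two are equal.
4) How strongly the post is endorsed, z: the absolute value of the difference between upvotes and downvotes. If the two are equal, z is 1.
Taking these four factors together, the Reddit algorithm looks more reasonable and more dependable than the Hacker News algorithm. But keep reading.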

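Details: the Reddit algorithm

Ranking algorithm 4: the Stack Overflow algorithm

The name probably looks familiar: it is the famous Stack Overflow, the world's number one Q&A community for programmers.

The Stack Overflow algorithm finds the hot questions in a given period, that is, the questions that draw the most attention and the most discussion. It mainly considers the following factors:

Qviews (number of views of the question)
Qscore (score of the question) and Qanswers (number of answers)
Ascores (scores of the answers)
Qage (time since the question was posted) and Qupdated (time since the most recent answer)
The Stack Overflow algorithm is designed specifically for ranking hot questions. The final rank increases with engagement (Qviews and Qanswers) and quality (Qscore and Ascores), and decreases with time (Qage and Qupdated).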

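Details: the Stack Overflow algorithm

Ranking algorithm 5: Newton's law of cooling

Newton's law of cooling is easy to state. In one sentence: the rate at which an object cools is proportional to the difference between its current temperature and the ambient (room) temperature.

The basic idea is as follows.

Think of a "hot articles" ranking as a process of natural cooling. Then:

(1) At any moment, every article on the site has a "current temperature", and the article with the highest temperature ranks first.
(2) When a user upvotes an article, its temperature rises by one degree.
(3) As time passes, the temperature of every article gradually cools.
What remains is to turn these three statements into a mathematical model, which gives the Newton's law of cooling ranking algorithm.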

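Details: Newton's law of cooling

Ranking algorithm 6: the Wilson interval (here comes the math; are you ready?)

The Wilson interval ranking algorithm is based entirely on probability and statistics. It makes the following assumptions:

(1) Each user's vote is an independent event.
(2) A user has only two choices: vote up or vote down.
(3) If n users vote in total and k of them vote up, the proportion of upvotes p equals k/n.
Under these assumptions, the Wilson interval ranking algorithm proceeds as follows:

Step 1: compute each item's approval rate (the proportion of upvotes).
Step 2: compute the confidence interval of each approval rate (at the 95% level).
Step 3: rank by the lower bound of the confidence interval. The larger this value, the higher the rank.
This works because the width of the confidence interval depends on the sample size. For example, A has 8 upvotes and 2 downvotes, and B has 80 upvotes and 20 downvotes. Both have an 80% approval rate, but B's confidence interval (say [75%, 85%]) is much narrower than A's (say [70%, 90%]), so the lower bound of B's interval (75%) is higher than A's (70%), and B should rank ahead of A.

In essence, the confidence interval corrects for reliability and compensates for a small sample. With many samples the result is fairly reliable and needs little correction, so the interval is narrow and its lower bound is relatively high. With few samples the result is less reliable and needs a large correction, so the interval is wide and its lower bound is relatively low.

The Wilson interval ranking algorithm solves the problem of unreliable results when too few people have voted.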
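The three steps can be sketched with the standard lower bound of the Wilson score interval. The function name wilsonLower and the choice to return 0 when there are no votes are assumptions for illustration; z = 1.96 corresponds to roughly 95% confidence.

```go
package main

import (
	"fmt"
	"math"
)

// wilsonLower returns the lower bound of the Wilson score interval for
// ups positive votes out of n total votes, at the confidence level given by z.
func wilsonLower(ups, n int, z float64) float64 {
	if n == 0 {
		return 0 // no votes yet: rank at the bottom (a design choice)
	}
	nf := float64(n)
	p := float64(ups) / nf // step 1: approval rate
	z2 := z * z
	center := p + z2/(2*nf)
	margin := z * math.Sqrt(p*(1-p)/nf+z2/(4*nf*nf))
	return (center - margin) / (1 + z2/nf) // step 2: lower bound of the interval
}

func main() {
	const z = 1.96
	a := wilsonLower(8, 10, z)   // A: 8 up, 2 down
	b := wilsonLower(80, 100, z) // B: 80 up, 20 down
	// Step 3: rank by the lower bound.
	fmt.Printf("A: %.3f  B: %.3f  B ranks first: %v\n", a, b, b > a)
}
```

The real bounds differ from the round numbers used in the example above, but the ordering is the same: B's bound is higher than A's even though both have an 80% approval rate.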

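Details: the Wilson interval ranking algorithm

Ranking algorithm 7: the Bayesian average (for experts)

We introduced Bayesian probability in the article "程序员必须知道的10个基础算法及其讲解" (ten basic algorithms every programmer must know, explained).

The Wilson interval solves the problem of unreliable results when too few people have voted: if only 2 people vote, for example, the lower bound of the Wilson interval pulls the approval rate down sharply. That keeps the ranking trustworthy, but it creates another problem: the top of the list is always occupied by the items with the most votes, while new or niche items have little chance to break through and may stay near the bottom for a long time.

The Bayesian average ranking algorithm considers the following factors:

1) C, the number of votes added to every item: a constant you choose yourself, related to the overall size of the site's user base. It can be set to the average number of votes per item.
2) n, the number of votes the item currently has.
3) x, the value of each vote cast for the item.
4) m, the overall mean score, that is, the arithmetic mean of all votes across the whole site.
The algorithm is called the "Bayesian average" because it borrows the idea of Bayesian inference: since the outcome of the voting is unknown, start with an estimate and keep correcting it with new information so that it gets closer and closer to the true value.

Among these factors, m (the overall mean score) plays the role of the prior. Each new vote is an adjustment that moves the item's score away from the overall mean and toward the item's own voting result. The more votes an item has, the closer its Bayesian average is to its own arithmetic mean, and the less the prior affects its rank.

This gives items with few votes a comparatively fair rank.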
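With the four factors above, the Bayesian average is usually written as (C*m + sum of x) / (C + n): every item starts with C imaginary votes at the site-wide mean m. A minimal sketch, assuming C > 0; the function name and the sample ratings are illustrative.

```go
package main

import "fmt"

// bayesianAverage returns (C*m + sum of votes) / (C + n), where n = len(votes).
// c is the number of imaginary prior votes and m is the site-wide mean score.
// c is assumed to be greater than 0.
func bayesianAverage(votes []float64, c, m float64) float64 {
	sum := 0.0
	for _, x := range votes {
		sum += x
	}
	return (c*m + sum) / (c + float64(len(votes)))
}

func main() {
	const c, m = 10, 3.0 // hypothetical: 10 prior votes at a site-wide mean of 3 stars
	newItem := []float64{5, 5}
	established := make([]float64, 200)
	for i := range established {
		established[i] = 4.5
	}
	fmt.Printf("new item (2 votes of 5):      %.2f\n", bayesianAverage(newItem, c, m))
	fmt.Printf("established (200 votes, 4.5): %.2f\n", bayesianAverage(established, c, m))
}
```

The new item is pulled strongly toward m, while the established item stays close to its own mean: the prior matters less as n grows.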

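Details: the Bayesian average

That covers the well-known vote-based ranking algorithms. Each one looks more dependable than the last, and each is harder to use than the last.

Back to the question in the title: if you had to implement this ranking, which algorithm would you choose? And which one does Geek Headlines actually use?

As far as my limited knowledge goes, the Geek Headlines ranking is most likely a concrete application of the Stack Overflow algorithm.

Based on the algorithms above, here are a few of my thoughts on how Geek Headlines is designed and implemented; experts passing by, please bear with me.
First: use the Stack Overflow algorithm for the ranking, modified and simplified to fit the product.

There are two main reasons:

1) Judging from the outside, the factors the Geek Headlines ranking mainly relies on are publication time, number of upvotes, and number of comments, and quite possibly number of clicks as well, although I cannot confirm that.

2) The Stack Overflow algorithm is relatively simple and easy to implement and tune. In practice simple algorithms often work best; as the saying goes, the greatest truths are the simplest (大道至简).

If you are interested, read the article "Google 阿卡 47 的制造者阿米特.辛格博士" (on Dr. Amit Singhal, maker of Google's "AK-47") to see that simplicity in action.

Second: why does Geek Headlines have no downvote button?

Because once a downvote button is added, the ranking quickly becomes hard to control and its quality drops. The main reason is that some opportunists may treat Geek Headlines as a channel for driving traffic (or may downvote at random just for fun). Once there is cheating, such as deliberately downvoting other people's articles and upvoting your own, the problem gets messy and the quality of the recommendations can no longer be guaranteed. Without a downvote button the problem is much simpler. On cheating, see the article "All That Glitters Is Not Gold: On Search Engine Spam" (Search Engine Anti-SPAM).

Third: Geek Headlines could add an administrator permission that lets administrators pin articles to the top by hand, so that the ranking is partly under human control rather than fully automatic. For now machines cannot fully replace people, especially in editorial work.

Fourth: if people and resources allow, machine learning could be applied to the ranking to build a smarter algorithm.

Picture this scenario: 50 articles are submitted and approved within three minutes (this is plausible, since Geek Headlines has as many as 30 editors). If no likes or comments arrive in the first few minutes, the algorithm can only sort by time, and sorting by time alone is bound to give unsatisfying results.
One possible solution: from historical Geek Headlines data, collect statistics on popular topics, keywords, and similar signals, and build a mathematical model. When many new articles arrive at once, first segment each article into words, then score each article with the model and rank by score. The article's source can also be considered, with reliable sources given a higher score, for example 伯乐网, CSDN, and 快课网 (just kidding).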
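As a rough illustration of that idea, the sketch below scores an article that has already been segmented into tokens. Everything in it is hypothetical: the keyword weights stand in for whatever model would be learned from historical data, and the source bonus table is invented for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// scoreArticle adds up the weights of the tokens found in the keyword model
// and adds a bonus for a trusted source. Tokens are assumed to be the output
// of a word segmenter; unknown tokens and unknown sources contribute 0.
func scoreArticle(tokens []string, source string, keywordWeight, sourceBonus map[string]float64) float64 {
	score := sourceBonus[source]
	for _, tok := range tokens {
		score += keywordWeight[strings.ToLower(tok)]
	}
	return score
}

func main() {
	// Hypothetical weights, standing in for statistics from past popular articles.
	keywordWeight := map[string]float64{"go": 2, "docker": 1.5}
	sourceBonus := map[string]float64{"CSDN": 1}

	tokens := []string{"Go", "and", "Docker", "in", "production"}
	fmt.Println(scoreArticle(tokens, "CSDN", keywordWeight, sourceBonus)) // 4.5
}
```

A score like this only needs to order articles until real votes and comments arrive, at which point one of the vote-based algorithms can take over.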

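Greek letters on a QWERTY keyboard

When studying mathematics you sometimes need to type Greek letters, and knowing the Greek input method well saves time. The Mac has a built-in Greek input method; below is its layout on a QWERTY keyboard.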

Lowercase:

; ς ε ρ τ υ θ ι ο π [ ] \
α σ δ φ γ η ξ κ λ ΄ '
ζ χ ψ ω β ν μ , . /

Uppercase:

: Σ Ε Ρ Τ Υ Θ Ι Ο Π [ ] \
Α Σ Δ Φ Γ Η Ξ Κ Λ ¨ "
Ζ Χ Ψ Ω Β Ν Μ < > ?

In alphabetical order of the Latin keys:

A B C D E F G
α β ψ δ ε φ γ
Α Β Ψ Δ Ε Φ Γ
H I J K L M N
η ι ξ κ λ μ ν
Η Ι Ξ Κ Λ Μ Ν
O P Q R S T
ο π ; ρ σ τ
Ο Π : Ρ Σ Τ
U V W X Y Z
θ ω ς χ υ ζ
Θ Ω Σ Χ Υ Ζ

Commonly used:

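α a alpha
β b beta
γ g gamma (similar sound)
Γ G gamma
δ d delta
Δ D delta
ε e epsilon
ζ z zeta
η h eta (similar shape; pronounced "ay-ta")
θ u theta (no pattern)
ι i iota
κ k kappa
λ l lambda (similar sound)
μ m mu (similar sound; pronounced "mew")
ν n nu (similar sound; pronounced "new")
ξ j xi (no pattern; pronounced "zie")
ο o omicron
π p pi (sound)
Π P pi
ρ r rho (similar sound; pronounced "row")
σ s sigma (similar sound)
Σ S sigma (similar sound)
τ t tau
υ y upsilon
φ f phi (pronounced "fie")
Φ F phi
χ x chi (pronounced "kie")
ψ c psi (pronounced "sigh")
Ψ C psi (pronounced "sigh")
ω v omega (no pattern)
Ω V omega (no pattern)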

TCP connection tuning

see: http://www.cnblogs.com/fczjuever/archive/2013/04/17/3026694.html

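The /etc/sysctl.conf file

  /etc/sysctl.conf is an interface for changing the settings of a running Linux system. It holds advanced options for the TCP/IP stack and the virtual memory system and can be used to control the Linux network configuration. Because changes made under /proc/sys/net are temporary (they are lost on reboot), it is better to add TCP/IP parameter changes to /etc/sysctl.conf, save the file, and run "/sbin/sysctl -p" to apply them immediately. The specific settings, following the article linked above: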

net.core.rmem_default = 256960
net.core.rmem_max = 513920
net.core.wmem_default = 256960
net.core.wmem_max = 513920
net.core.netdev_max_backlog = 2000
net.core.somaxconn = 2048
net.core.optmem_max = 81920
net.ipv4.tcp_mem = 131072 262144 524288
net.ipv4.tcp_rmem = 8760 256960 4088000
net.ipv4.tcp_wmem = 8760 256960 4088000
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_sack = 1
net.ipv4.tcp_fack = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
# net.ipv4.tcp_tw_recycle = 1  (not recommended: breaks clients behind NAT, and the option was removed in Linux 4.12)
net.ipv4.tcp_fin_timeout = 30
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_max_syn_backlog = 2048
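
Each key in this file corresponds to a file under /proc/sys, with the dots replaced by slashes. The sketch below parses one line in the format shown above and derives that path, which can be handy for checking the value currently in effect. The function names are made up for illustration, and the simple dot-to-slash mapping assumes that no path component (such as an interface name) contains a dot itself.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSysctlLine splits one "key = value" line from a sysctl.conf-style file.
// ok is false for blank lines, comment lines (# or ;), and lines without "=".
func parseSysctlLine(line string) (key, value string, ok bool) {
	line = strings.TrimSpace(line)
	if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") {
		return "", "", false
	}
	k, v, found := strings.Cut(line, "=")
	if !found {
		return "", "", false
	}
	return strings.TrimSpace(k), strings.TrimSpace(v), true
}

// procPath maps a dotted key to its file under /proc/sys.
func procPath(key string) string {
	return "/proc/sys/" + strings.ReplaceAll(key, ".", "/")
}

func main() {
	for _, line := range []string{
		"net.core.somaxconn = 2048",
		"net.ipv4.tcp_rmem = 8760 256960 4088000",
		"# a comment line",
	} {
		if key, value, ok := parseSysctlLine(line); ok {
			fmt.Printf("%s -> %s (%s)\n", key, procPath(key), value)
		}
	}
}
```

Multi-value settings such as tcp_rmem are kept as a single string here; splitting them with strings.Fields would be the next step if the individual numbers are needed.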