I am currently teaching this course at KPI and wanted to find volunteers (gurus in this field, so to speak) to give a couple of guest lectures. If I put myself in the place of a company head who is constantly looking for employees, or even a project manager who also has to take part in hiring for their project, I would find this interesting even from a purely utilitarian point of view. Not to mention the benefit to the community :)
I tried asking on the devua forum: http://www.developers.org.ua/forum/topic/384. No results so far. Maybe I'm asking in the wrong place, or is this simply not the custom here?
2009-02-17
Sending SMTP mail with UTF-8 characters from Common Lisp
Today I explored the topic of sending a base64-encoded SMTP message, and it turned out to be rather tricky. I discovered that this task (if you don't rely on Franz's infrastructure) effectively requires 4 libraries. As they are very scarcely documented, I decided to write this short description.
CL-SMTP
Initially I knew very little about the SMTP protocol (except for HELO and EHLO), so I started with the plain CL-SMTP functionality of SEND-EMAIL [1]. The function is not documented and signals no errors of its own; it just re-signals the errors of the underlying USOCKET library. That's why it took some effort on my part to understand why the following code produces USOCKET:UNKNOWN-ERROR [2]:
(cl-smtp:send-email "localhost" "noreply@our.domain.net" "test@gmail.com"
                    "subject"
                    "тест")
Explanation: Sending mail through the SMTP server on localhost from noreply@our.domain.net to test@gmail.com with the subject "subject" and the body "тест".
It turned out that the SMTP server just didn't accept the non-ASCII characters in the body, because the default encoding is 7bit.
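A quick way to tell in advance whether a message body fits the 7bit default is to check that every character code is below 128. This is a minimal sketch in plain Common Lisp. The name SEVEN-BIT-P is made up for illustration and is not part of CL-SMTP, and the check assumes that CHAR-CODE agrees with ASCII for the basic characters (which holds on the common Unicode-aware implementations such as SBCL):

```lisp
;; SEVEN-BIT-P is a hypothetical helper, not a CL-SMTP function.
(defun seven-bit-p (string)
  "Return true if every character of STRING has a code below 128."
  (every (lambda (ch) (< (char-code ch) 128)) string))

(seven-bit-p "test")  ; true: fits the 7bit default
(seven-bit-p "тест")  ; false: contains non-ASCII characters
```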
In the process I discovered a useful debugging feature of CL-SMTP [3]:
(setf cl-smtp::*debug* t)
It will print the SMTP interaction log.
CL-MIME
So I asked Google and found this article by Hans Hübner, where he explains his enhancements of CL-SMTP (currently integrated in the codebase) and describes how to send attachments with it. But to properly apply his examples to my case I first had to learn a couple of things about MIME. In the example Hans uses the multipart/mixed Content-Type for sending a message with attachments. That is not necessary for the simple task of sending a text message in the UTF-8 charset: for that you can use the text/plain Content-Type with charset UTF-8. But for non-ASCII symbols to be accepted by the mail server, they should (usually) be encoded in base64, which is declared in the Content-Transfer-Encoding header. All these activities are handled by the CL-MIME library. The library is quite self-explanatory, so the lack of documentation doesn't hurt, except in a couple of places.
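To make the MIME part concrete, here is a rough sketch of the header block that a single-part UTF-8 text message in base64 is expected to carry. It is written in plain Common Lisp only to show the shape of the data. The function WRITE-TEXT-MIME-HEADERS is hypothetical and is not how CL-MIME does it; the exact formatting that CL-MIME emits may differ.

```lisp
;; Hypothetical illustration; in practice CL-MIME generates these headers.
(defun write-text-mime-headers (stream &key (charset "UTF-8") (encoding "base64"))
  "Write the MIME headers of a single-part text message to STREAM."
  (flet ((line (control &rest args)
           ;; Every header line in a mail message ends with CRLF.
           (apply #'format stream control args)
           (write-char #\Return stream)
           (write-char #\Linefeed stream)))
    (line "MIME-Version: 1.0")
    (line "Content-Type: text/plain; charset=~A" charset)
    (line "Content-Transfer-Encoding: ~A" encoding)
    ;; An empty line separates the headers from the encoded body.
    (line "")))

(write-text-mime-headers *standard-output*)
```

Everything after the empty line is the body, which in this case would be the base64 text.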
First of all, properly formatted MIME text data is produced with the function PRINT-MIME [4], which takes a CLOS MIME object with the appropriately set fields. The problem is that the generated data contains both the MIME headers and the part which should go into the message body. So the function's output can't be used as an argument to SEND-EMAIL, because the headers will go to the data section and the mail client won't consider them (so the body will be displayed undecoded). For this case (and other cases when you need more control over the SMTP interaction) Hans has created the macro WITH-SMTP-MAIL [5]. There's a little catch in it as well: unlike SEND-EMAIL, which I called with a single recipient string, it accepts a list of recipients.
CL-BASE64 & ARNESI
The second thing, which actually caused me the most trouble, was the tricky and once again undocumented handling of the :CONTENT initarg of MIME objects [6]. When you provide an :ENCODING initarg, most commonly :BASE64, the content part of the data emitted by PRINT-MIME is subjected to the appropriate encoding (performed by CL-BASE64). The interesting thing is that it produces wrong output for UTF-8 strings. The proper argument format is an octet array, so you need a function to convert the string to this format.
ARNESI is a useful library that provides a lot of small utilities from different areas. So I was glad to find out that it provides the needed function, STRING-TO-OCTETS [7], because the library was already used in my project.
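To show what "reducing a string to octets" means, here is a small dependency-free sketch of UTF-8 encoding in plain Common Lisp. It is not ARNESI's implementation, and the name UTF-8-OCTETS is made up for this illustration. It assumes a Unicode-aware implementation where CHAR-CODE returns the code point, and it does not validate surrogates.

```lisp
;; Hypothetical stand-in for illustration; the post itself uses
;; ARNESI:STRING-TO-OCTETS for this step.
(defun utf-8-octets (string)
  "Encode STRING as a vector of UTF-8 octets."
  (let ((out (make-array 0 :element-type '(unsigned-byte 8)
                           :adjustable t :fill-pointer 0)))
    (flet ((emit (octet) (vector-push-extend octet out))
           (tail (code shift)
             ;; Continuation octet: 10xxxxxx with six payload bits.
             (logior #x80 (logand (ash code (- shift)) #x3F))))
      (loop for ch across string
            for code = (char-code ch)
            do (cond ((< code #x80)              ; 1 octet (ASCII)
                      (emit code))
                     ((< code #x800)             ; 2 octets
                      (emit (logior #xC0 (ash code -6)))
                      (emit (tail code 0)))
                     ((< code #x10000)           ; 3 octets
                      (emit (logior #xE0 (ash code -12)))
                      (emit (tail code 6))
                      (emit (tail code 0)))
                     (t                          ; 4 octets
                      (emit (logior #xF0 (ash code -18)))
                      (emit (tail code 12))
                      (emit (tail code 6))
                      (emit (tail code 0))))))
    out))

(utf-8-octets "тест")  ; => #(209 130 208 181 209 129 209 130)
```

The four Cyrillic characters become eight octets, and it is these octets, not the characters, that base64 has to encode; for this input the base64 text is 0YLQtdGB0YI=.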
It's worth mentioning that non-ASCII characters inside a MIME body can also be sent as is. But, AFAIU, that is not as robust as sending them in base64-encoded form.
Result
So the final code turned out to be like this:
(defun send-email (text &rest recipients)
  "Send an SMTP mail with some TEXT to RECIPIENTS."
  (cl-smtp:with-smtp-mail (out "localhost" "noreply@fin-ack.com" recipients)
    (cl-mime:print-mime out
                        (make-instance 'cl-mime:text-mime
                                       :encoding :base64 :charset "UTF-8"
                                       ;; :CONTENT must be an octet vector, not a string
                                       :content (arnesi:string-to-octets text :utf-8))
                        t t)))
Lessons learned
- To send plain-text ASCII email, use CL-SMTP:SEND-EMAIL
- If USOCKET:UNKNOWN-ERROR is signaled, most probably the arguments are not properly formatted
- For debugging, use (setf cl-smtp::*debug* t)
- To use MIME efficiently, combine CL-SMTP:WITH-SMTP-MAIL with CL-MIME:PRINT-MIME
- You need to supply an octet vector, not a string, to CL-MIME:TEXT-MIME's :CONTENT initarg
- To break a UTF-8 string into octets, use ARNESI:STRING-TO-OCTETS
2009-02-01
Nokia Locate Sensor
From idea:
Received: by 10.141.28.2 with HTTP; Mon, 22 Oct 2007 05:25:31 -0700 (PDT)
Message-ID: <89dc7c5b0710220525p4e33dd50i6cb85d85cc1b0903@mail.gmail.com>
Date: Mon, 22 Oct 2007 15:25:31 +0300
From: Vsevolod
To: info@janchipchase.com
Subject: idea to help not forget phones
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_Part_8479_1626797.1193055931431"
Delivered-To: vseloved@gmail.com
------=_Part_8479_1626797.1193055931431
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Hello Jan,
My name is Vsevolod Dyomkin. I wanted to share a design idea related to
mobile phones with you, as an only person I can presently reach, who can
possibly facilitate its implementation.
Today I have seen your presentation at TED.com concerning the uses of mobile
phones. What interested me in it is the mention of places, which serve as
gravity centers for important carried items. In my opinion, the existence of
such places, in spite of the natural need for them, brings about one big
potential inconvenience…
Take my example, I work inside a big building and at my workplace the level
of mobile signal is very poor, so I'm forced to leave my phone in the other
part of the room where the connectivity is better. This sometimes leads to
unpleasant situations when I forget the phone, leaving the room and
building. Other example can be, when a person comes to a party and leaves
her phone in some place not to be distracted by it. Afterwards he'll pretty
probably forget about it. This may not only be the case with phones, but
also with other carried items as well.
To prevent such situations I've come up with the following idea: to have a
small device which informs you (like beeps), when you part, for example, 10
meters from the item. It will consist of two parts — a number of RFID tags
(in a form-factor of small round colored stickers), which can be stuck to
a mobile phone, a key, an id card etc. and a receiver/speaker, which can be
a charm on a keyring or a bangle, which beeps. The receiver can optionally
show the color of a sticker, which caused an alarm.
To me this is an example of delegation of mundane/error-prone tasks to
technology :-) — in this case the delegation of the necessity to flap one's
pockets...
If you find it interesting, feel free to contact me
Best regards
Vsevolod
...to implementation
How lexical scope is important
"Fexprs more flexible, powerful, easier to learn? (Newlisp vs CL)" @ c.l.l.
Rainer Joswig (with some participation from Kaz Kylheku and Pascal Bourguignon) explains, using a practical example, which problems of dynamic scope (still used in the suggested "improved" newLisp, which actually turns out to be old :) are solved by lexical scope.
Bonus: how to create lisp-style special global variables in C++ (and a discussion of what can be improved in CL in this regard)