Following multiple log files efficiently
I intend to create a programme that can permanently follow a large, dynamic set of log files and copy their entries into a database for easier near-realtime statistics. The log files are written by diverse daemons and applications, but their format is known, so they can be parsed. Some of the daemons write logs into one file per day; for example, Apache's cronolog creates files like access.20100928. Those files appear with each new day and may disappear again when they're gzipped away the next day.
The target platform is an Ubuntu server, 64 bit.
What would be the best approach to efficiently reading the log files?
I can think of scripting languages like PHP that either open the files themselves and read new data, or that use system tools like tail -f to follow the logs; or of other runtimes like Mono. Bash shell scripts probably aren't well suited for parsing the log lines and inserting them into a database server (MySQL), not to mention easy configuration of the app.
If the programme read the log files itself, I'd think it should stat() each file about once a second to get its size and open the file only when it has grown. After reading the file (which should hopefully only return complete lines) it would call tell() to get the current position, and the next time it would directly seek() to the saved position and continue reading from there. (These are C function names, but I wouldn't want to do it in C. Mono/.NET or PHP offer similar functions as well.)
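A minimal sketch of that polling approach in PHP; handleLine() is a hypothetical callback that would parse a line and insert it into MySQL:

```php
<?php
// Poll one log file by size, remembering the last read offset.
// $offset would be persisted between runs; $handleLine is a hypothetical
// callback that parses a line and writes it to the database.
function followFile($path, &$offset, $handleLine)
{
    clearstatcache();                   // PHP caches stat() results
    $stat = @stat($path);
    if ($stat === false) {
        return;                         // file is gone, e.g. gzipped away
    }
    if ($stat['size'] < $offset) {
        $offset = 0;                    // file was truncated or replaced
    }
    if ($stat['size'] === $offset) {
        return;                         // nothing new, don't even open it
    }

    $fp = fopen($path, 'r');
    fseek($fp, $offset);
    while (($line = fgets($fp)) !== false) {
        if (substr($line, -1) !== "\n") {
            break;                      // incomplete last line, re-read it next time
        }
        $handleLine(rtrim($line, "\n"));
        $offset = ftell($fp);           // only advance past complete lines
    }
    fclose($fp);
}
```

Calling this once a second per followed file costs one stat() per file per second and only opens a file that has actually grown.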
Is the constant stat()ing of the files and the subsequent opening and closing a problem? How does tail -f do it? Can it keep the files open and be notified about new data with something like select()? Or does select() always return at the end of file?
In case I'm blocked in some kind of select() or in an external tail, I'd need to interrupt it every one or two minutes anyway to scan for new or deleted files that shall (or shall no longer) be followed. Resuming with tail -f would then probably not be reliable; that should work better with my own saved file positions.
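That one- or two-minute rescan could simply diff a glob() of the watched patterns against the set of currently followed files; the patterns and the offset bookkeeping here are assumptions:

```php
<?php
// Rescan the watched patterns and adjust the set of followed files.
// $offsets maps path => saved read position, as used by followFile() above.
function rescan(array $patterns, array &$offsets)
{
    $current = array();
    foreach ($patterns as $pattern) {          // e.g. '/var/log/apache2/access.*'
        foreach (glob($pattern) as $path) {
            $current[$path] = true;
            if (!isset($offsets[$path])) {
                $offsets[$path] = 0;           // new daily file: follow from the start
            }
        }
    }
    foreach (array_keys($offsets) as $path) {
        if (!isset($current[$path])) {
            unset($offsets[$path]);            // file was gzipped away: stop following
        }
    }
}
```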
Could I use some kind of inotify (file system notification) for that?
If you want to know how tail -f works, why not look at its source? In a nutshell, you don't need to periodically interrupt or constantly stat() to scan for changes to files or directories. That's exactly what inotify does.
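A rough sketch of the inotify route in PHP, assuming the PECL inotify extension is installed; the directories and the event handling are placeholders:

```php
<?php
// Watch the log directories with inotify instead of polling.
// Requires the PECL inotify extension (pecl install inotify).
$inotify = inotify_init();

$watches = array();
foreach (array('/var/log/apache2', '/var/log/myapp') as $dir) {
    // IN_MODIFY: a followed file grew; IN_CREATE/IN_MOVED_TO: a new daily
    // file appeared; IN_DELETE/IN_MOVED_FROM: a file was gzipped away.
    $wd = inotify_add_watch($inotify, $dir,
        IN_MODIFY | IN_CREATE | IN_MOVED_TO | IN_DELETE | IN_MOVED_FROM);
    $watches[$wd] = $dir;
}

while (true) {
    // By default inotify_read() blocks until at least one event arrives.
    foreach (inotify_read($inotify) as $event) {
        $path = $watches[$event['wd']] . '/' . $event['name'];
        if ($event['mask'] & IN_MODIFY) {
            // read the new lines from $path starting at its saved offset
        } elseif ($event['mask'] & (IN_CREATE | IN_MOVED_TO)) {
            // start following $path from offset 0
        } else {
            // stop following $path and discard its saved offset
        }
    }
}
```

Watching the directories rather than the individual files means newly created daily files show up as events too, so no periodic directory scan is needed.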