Identifying a C# or C++ function start in a line count program
I have a program, written in C#, that when given a C++ or C# file counts the lines in the file, how many of them are in comments, and how many are in designer-generated code blocks. I want to add the ability to count how many functions are in the file and how many lines are in those functions. I can't quite figure out how to determine whether a line (or series of lines) is the start of a function (or method).
At the very least, a function declaration is a return type followed by an identifier and an argument list. Is there a way to determine in C# whether a token is a valid return type? If not, is there a way to determine whether a line of code is the start of a function? I need to be able to reliably distinguish something like
bool isthere() { ... }
from
bool ishere = isthere()
and
isthere()
as well as from other function declaration lookalikes.
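Start by scanning scopes. You need to count open braces { and close braces } as you work your way through the file, so that you know which scope you are in. You also need to parse // and /* ... */ as you scan the file, so you can tell when something is in a comment rather than being real code. There's also #if, but you would have to compile the code to know how to interpret these.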
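As a rough illustration of that scan, here is a minimal sketch in C++ (the logic should port to C# almost line for line). The names `BraceEvent` and `scanBraces` are made up for this example. It also skips string and character literals, which is an extra assumption on my part, and it does not try to interpret #if at all.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One event per brace found in real code (outside comments and literals).
struct BraceEvent {
    std::size_t offset;  // position of the brace in the source text
    int depth;           // depth of the scope this brace opens or closes (1 = outermost)
    bool open;           // true for '{', false for '}'
};

// Walks the text once, skipping // and /* */ comments plus string and
// character literals, and records every brace with its scope depth.
// Preprocessor lines such as #if are NOT interpreted.
std::vector<BraceEvent> scanBraces(const std::string& src) {
    enum class State { Code, LineComment, BlockComment, String, Char };
    std::vector<BraceEvent> events;
    State state = State::Code;
    int depth = 0;

    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';

        switch (state) {
        case State::Code:
            if (c == '/' && next == '/') { state = State::LineComment; ++i; }
            else if (c == '/' && next == '*') { state = State::BlockComment; ++i; }
            else if (c == '"') { state = State::String; }
            else if (c == '\'') { state = State::Char; }
            else if (c == '{') { events.push_back({i, ++depth, true}); }
            else if (c == '}') { events.push_back({i, depth--, false}); }
            break;
        case State::LineComment:
            if (c == '\n') state = State::Code;
            break;
        case State::BlockComment:
            if (c == '*' && next == '/') { state = State::Code; ++i; }
            break;
        case State::String:
            if (c == '\\') ++i;  // skip the escaped character
            else if (c == '"') state = State::Code;
            break;
        case State::Char:
            if (c == '\\') ++i;  // skip the escaped character
            else if (c == '\'') state = State::Code;
            break;
        }
    }
    return events;
}
```

Matching open and close events carry the same depth, so the line span of a scope is just the distance between the two offsets. C# verbatim strings, C++ raw strings and digit separators such as 1'000 would confuse this version and need their own states.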
Then you need to parse the text immediately prior to each scope's open brace to work out what it is. Your functions may be in global scope, class scope, or namespace scope, so you have to be able to parse namespaces and classes to identify the type of scope you are looking at. You can usually get away with fairly simple parsing (most programmers use a similar style - for example, it's uncommon for someone to put blank lines between 'class fred' and its open brace, but they might write 'class fred {'). There is also a chance that they put extra junk on the line - e.g. 'template class __declspec myweirdmacro fred {'. However, you can get away with a pretty simple "does the line contain the word 'class' with whitespace on both sides?" heuristic, which will work in most cases.
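A minimal sketch of that whitespace-delimited keyword heuristic, again in C++. `ScopeKind` and `classifyScopeHeader` are hypothetical names, and treating namespace, struct and enum the same way as class is my assumption. The input is the text found just before an open brace.

```cpp
#include <sstream>
#include <string>

enum class ScopeKind { Namespace, Class, Struct, Enum, Other };

// Classifies the text found just before an open brace by looking for a
// scope keyword that stands alone as a whitespace-delimited word.
// The first keyword found wins, so "enum class color" counts as an enum.
ScopeKind classifyScopeHeader(const std::string& header) {
    std::istringstream words(header);
    std::string word;
    while (words >> word) {
        if (word == "namespace") return ScopeKind::Namespace;
        if (word == "enum") return ScopeKind::Enum;
        if (word == "class") return ScopeKind::Class;
        if (word == "struct") return ScopeKind::Struct;
    }
    return ScopeKind::Other;  // function body, if/for block, initializer, ...
}
```

Splitting on whitespace treats the start and end of the text as word boundaries, so 'class fred' matches even with nothing in front of it. It is still only a heuristic: a template parameter list written as 'template <typename T, class U>' would be misread as a class scope.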
OK, so now you know you are inside a namespace, and inside a class, and you find a new open scope. Is it a method?
The main identifying features of a method are:
- A return type. This could be any sequence of characters and can be many tokens ("__dllexport const unsigned myint32typedef * &"). Unless you compile the entire project you have no chance of validating it.
- A function name. This is a single token (but watch out for "operator =" etc.).
- A pair of brackets containing 0 or more parameters or 'void'. This is your best clue.
- A function declaration will not include the reserved words that precede many other scopes (e.g. enum, class, struct, etc.). It may, however, use some reserved words (template, const etc.) that you must not trip over.
So you could search backwards for a blank line, or a line ending in ; { or }, which indicates the end of the previous statement/scope. Then grab all the text between that point and the open brace of your scope. Extract a list of tokens, and try to match the parameter-list brackets. Check that none of the tokens are reserved words (enum, struct, class etc.).
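The check on that grabbed text could look roughly like the sketch below. It is written in C++, `looksLikeFunctionHeader` is a hypothetical name, and the input is assumed to be the header text with comments already stripped. Beyond enum, struct and class, the rejected-word list (if, while, new and so on) is my own guess at what else commonly precedes a brace. Requiring at least two tokens before the bracket follows the "return type plus name" rule, which means constructors and destructors would need a special case.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// header: the text between the end of the previous statement or scope and
// the open brace under test, with comments already removed.
bool looksLikeFunctionHeader(const std::string& header) {
    // Words that introduce some other kind of scope. enum/struct/class come
    // from the text above; the rest are assumptions added for this sketch.
    static const std::set<std::string> notAFunction = {
        "enum", "struct", "class", "namespace", "union", "if", "else", "for",
        "foreach", "while", "switch", "catch", "do", "using", "lock",
        "return", "new"};

    // The parameter list is the best clue: find it and check it balances.
    const std::size_t open = header.find('(');
    if (open == std::string::npos) return false;
    int depth = 0;
    bool closed = false;
    for (std::size_t i = open; i < header.size() && !closed; ++i) {
        if (header[i] == '(') ++depth;
        else if (header[i] == ')' && --depth == 0) closed = true;
    }
    if (!closed) return false;

    // Everything before '(' should be "<return type tokens> <name>".
    std::istringstream in(header.substr(0, open));
    std::vector<std::string> tokens;
    bool isOperator = false;
    for (std::string t; in >> t;) {
        if (notAFunction.count(t) != 0) return false;
        if (t.compare(0, 8, "operator") == 0) isOperator = true;
        tokens.push_back(t);
    }
    if (tokens.size() < 2) return false;  // need a return type and a name
    if (isOperator) return true;          // e.g. "fred& operator = (...)"

    // An '=' before the parameter list means an initialisation such as
    // "bool ishere = isthere()", not a declaration.
    if (header.find('=') < open) return false;

    // The name may carry a leading '*' or '&' from the return type.
    const std::string& name = tokens.back();
    const std::size_t first = name.find_first_not_of("*&");
    if (first == std::string::npos) return false;
    const unsigned char c = static_cast<unsigned char>(name[first]);
    return std::isalpha(c) != 0 || c == '_';
}
```

With this, "bool isthere()" is accepted, while "bool ishere = isthere()" and a bare "isthere()" are rejected. In practice those two call forms rarely reach the check at all, since they end in ; rather than opening a scope.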
This will give you a "reasonable degree of confidence" that you have a method. You don't need much parsing to get a pretty high degree of accuracy. You could spend a lot of time finding all the special cases that confuse your "parser", but if you are working on a reasonably consistent code-base (i.e. just your own company's code) then you'll probably be able to identify all the methods in the code fairly easily.