Allow different insertion points for each standard #52

mg979 · 2019-10-06T21:23:21Z

pattern['comment']['insert'] can be defined as a dictionary, so that
each b:doge_doc_standard can behave differently in this regard

Prelude

Thank you for helping out DoGe!

By contributing to DoGe you agree to the following statements (Replace [ ] with [x] with those you agree with):

I have read and understood the Contribution Guidelines.
I have read and understood the Code of Conduct.

Why this PR?

If different doc standards are supported, they could insert the documentation in different positions.

pattern['comment']['insert'] can be defined as a dictionary, so that each b:doge_doc_standard can behave differently in this regard

kkoomen · 2019-11-25T13:42:25Z

@mg979 Is this something you need or has this a specific reason? I don't think this is useful, since documentation is written in the same positioning for every doc standard. For example: Python will always be inside the function instead of above the function, no matter the doc standard. I don't see any logical reason to alter this for a specific doc standard.

mg979 · 2019-11-25T14:15:28Z

Python is a special case because it has a well defined doc standard, other languages don't have one. In such languages, one could have (for example) a doc standard for important functions and one for helpers, or depending on the type of script, one may want to put the docstring in different positions. It's something I was using for vimscript. Still, if you don't think it's useful, no problem. If you're unsure you can leave it for later.

kkoomen · 2019-11-25T15:19:30Z

If you can give me a scenario you would use it for I will consider implementing it.

mg979 · 2019-11-26T10:10:31Z

Looking back at it, the way I did it is really ugly. The problem is that it's not easy to do it cleanly with the way buffer doc standards are implemented right now. At the same time I think this should be addressed, considering also #60. How can you implement custom doc standards, if they aren't allowed to set their insert point?

I think the main problem is how b:doge_patterns is implemented, as a list and not as a dictionary of doc standards. Each doc standard should be a self-contained unity, but they're not.

Example from lua doc standard:

let b:doge_patterns = []

call add(b:doge_patterns, {
\  'match': '\m^\%(local\s*\)\?function\s*\%([[:alnum:]_:.]\+[:.]\)\?\%([[:alnum:]_]\+\)\s*(\(.\{-}\))',
\  'match_group_names': ['parameters'],
\  'parameters': {
\    'match': s:parameters_match_pattern,
\    'match_group_names': ['name'],
\    'format': {
\      'ldoc': '@param {name} !description',
\    },
\  },
\  'comment': {
\    'insert': 'above',
\    'template': {
\      'ldoc': [
\        '-- !summary',
\        '-- !description',
\        '%(parameters|-- {parameters})%',
\      ],
\    },
\  },
\})

You define b:doge_patterns as a list with patterns, and the doc standard comment formatting inside of them.

Instead I think it would be better like:

" just extend the current dict, if it's already existing
let b:doge_doc_standards = get(b:, 'doge_doc_standards', {})

let b:doge_doc_standards.ldoc = [{
\  'match': '\m^\%(local\s*\)\?function\s*\%([[:alnum:]_:.]\+[:.]\)\?\%([[:alnum:]_]\+\)\s*(\(.\{-}\))',
\  'match_group_names': ['parameters'],
\  'parameters': {
\    'match': s:parameters_match_pattern,
\    'match_group_names': ['name'],
\    'format': '@param {name} !description',
\  },
\  'comment': {
\    'insert': 'above',
\    'template': [
\        '-- !summary',
\        '-- !description',
\        '%(parameters|-- {parameters})%',
\    ],
\  },
\}]

That is, each doc standard as a list with its own patterns, and everything that concerns a specific doc standard is isolated from the rest. You can't use 2 doc standards at the same time anyway.

This allows doc standards to be defined in a custom ftplugin, and doge will just pick them up and use them as it finds them, when they are requested.

I'm closing this PR because definitely there is more to it and I think it would need a different approach.

kkoomen · 2019-11-26T14:22:21Z

@mg979 I am indeed looking for something that could solve issue #60 and I remember again why I choose this. It's A or B I had to choose. I had to duplicate doc standards (current version) or I had to duplicate patterns and their regex (your suggestion).

I'd be more than happy to get your second thought on this. My suggestion is that we use dictionaries indeed and use a base for every pattern which we extend. This is what my suggestion is using python (which cleans up a lot of code!)

Current implementation

call add(b:doge_patterns, {
\  'match': '\m^def\s\+\%([^(]\+\)\s*(\(.\{-}\))\%(\s*->\s*\(.\{-}\)\)\?\s*:',
\  'match_group_names': ['parameters', 'returnType'],
\  'parameters': {
\    'match': '\m\([[:alnum:]_]\+\)\%(\s*:\s*\([[:alnum:]_.]\+\%(\[[[:alnum:]_[\],[:space:]]*\]\)\?\)\)\?\%(\s*=\s*\([^,]\+\)\)\?',
\    'match_group_names': ['name', 'type', 'default'],
\    'format': {
\      'reST': ':param {name} {type|!type}: !description',
\      'numpy': [
\        '{name} : {type|!type}',
\        '\t!description',
\      ],
\    },
\  },
\  'comment': {
\    'insert': 'below',
\    'template': {
\      'reST': [
\        '"""',
\        '!description',
\        '',
\        '%(parameters|{parameters})%',
\        '%(returnType|:rtype {returnType}: !description)%',
\        '"""',
\      ],
\      'numpy': [
\        '"""',
\        '!summary',
\        '',
\        '!description',
\        '%(parameters|)%',
\        '%(parameters|Parameters)%',
\        '%(parameters|----------)%',
\        '%(parameters|{parameters})%',
\        '%(returnType|)%',
\        '%(returnType|Returns)%',
\        '%(returnType|-------)%',
\        '%(returnType|{returnType}:)%',
\        '%(returnType|\t!description)%',
\        '"""',
\      ],
\    },
\  },
\})

Newer implementation

" This is our base for the "functions" pattern.
" Every doc standard will at least extend this.
let s:func_pattern_base = {
\  'match': '\m^def\s\+\%([^(]\+\)\s*(\(.\{-}\))\%(\s*->\s*\(.\{-}\)\)\?\s*:',
\  'match_group_names': ['parameters', 'returnType'],
\  'parameters': {
\    'match': '\m\([[:alnum:]_]\+\)\%(\s*:\s*\([[:alnum:]_.]\+\%(\[[[:alnum:]_[\],[:space:]]*\]\)\?\)\)\?\%(\s*=\s*\([^,]\+\)\)\?',
\    'match_group_names': ['name', 'type', 'default'],
\  },
\  'comment': {
\    'insert': 'below',
\  },
\}

" Here we only have to add the parameters and comment.
let b:doge_patterns.reST = map([
\  {
\    'parameters': {
\      'format': ':param {name} {type|!type}: !description'
\    },
\    'comment': {
\      'template': [
\        '"""',
\        '!description',
\        '',
\        '%(parameters|{parameters})%',
\        '%(returnType|:rtype {returnType}: !description)%',
\        '"""',
\      ],
\    },
\  },
\], {_, pattern -> extend(s:func_pattern_base, pattern)})

" Here we only have to add the parameters and comment.
let b:doge_patterns.numpy = map([
\  {
\    'parameters': {
\      'format': [
\        '{name} : {type|!type}',
\        '\t!description',
\      ],
\    },
\    'comment': {
\      'template': [
\        '"""',
\        '!description',
\        '',
\        '%(parameters|{parameters})%',
\        '%(returnType|:rtype {returnType}: !description)%',
\        '"""',
\      ],
\    },
\  },
\], {_, pattern -> extend(s:func_pattern_base, pattern)})

Note that I use extend(s:func_pattern_base, pattern) inside map([{...}, {...}, {...}], {_, pattern -> extend(s:func_pattern_base, pattern)}) which does not work since extend is not merging nested dictionaries. We would have to create a helpers function for this, such as deepextend().

NOTE when we implement a deepextend() helper function: The user can define patterns in their ftplugin/<filetype>.vim which should have more prio, so if we come across an already existing key we should keep the value from the user to ensure they can overwrite every key they want.

Another thing I would like to change (but not sure if this is the way to go) is instead of making let b:doge_patterns.<doc-standard> a list, I think we should make these dictionaries as well and use named keys. This way the user can overwrite these by name instead of index. For example:

This:

let b:doge_patterns.numpy = map([
\  {
\    'parameters': {
\      'format': [
\        '{name} : {type|!type}',
\        '\t!description',
\      ],
\    },
\    'comment': {
\      'template': [
\        '"""',
\        '!description',
\        '',
\        '%(parameters|{parameters})%',
\        '%(returnType|:rtype {returnType}: !description)%',
\        '"""',
\      ],
\    },
\  },
\  {
\     " ...
\  },
\], {_, pattern -> extend(s:func_pattern_base, pattern)})

will become this:

let b:doge_patterns.numpy = map({
\  'functions': {
\    'parameters': {
\      'format': [
\        '{name} : {type|!type}',
\        '\t!description',
\      ],
\    },
\    'comment': {
\      'template': [
\        '"""',
\        '!description',
\        '',
\        '%(parameters|{parameters})%',
\        '%(returnType|:rtype {returnType}: !description)%',
\        '"""',
\      ],
\    },
\  },
\  'class_methods': {
\    " ...
\  },
\}, {_, pattern -> extend(s:func_pattern_base, pattern)})

I think this is already a solid solution and I am looking forward to implement this.

Let me know if you have any other suggestions or things that should be different from your point of view.

mg979 · 2019-11-26T16:55:23Z

In the stock doc standards (the ones you provide) you can organize the code as you wish. Personally I'd prefer a more linear implementation (without nested dictionaries, that can be easily extended), but that would require other changes. Anyway I don't mind some code repetition, in the end, each ftplugin does its own thing, even if there's some repeated code it's hard to get lost.

But the main thing is: that structure will have to be used by custom doc standards as well, so in my opinion it's better to keep things easy. Personally I don't see the need for nested dictionaries.

Then, b:doge_patterns should be renamed, because it would be a dictionary of doc standards, not of patterns.

A non-nested template would be like:

let s:template = {
\  'match': ...,
\  'match_group_names': ...,
\  'parameters': ...,
\  'parameters_group_names': ...,
\  'parameters_format': ...,
\  'insert': ...,
\  'template': ...,
\}

where the parameters_x keys are only processed if the parameters key is also present.

For example, a non-nested implementation based on your python snippet:

" This is our base for the "functions" pattern.
" Every doc standard will at least extend this.
let s:func_template = {
\  'match': '\m^def\s\+\%([^(]\+\)\s*(\(.\{-}\))\%(\s*->\s*\(.\{-}\)\)\?\s*:',
\  'match_group_names': ['parameters', 'returnType'],
\  'parameters': '\m\([[:alnum:]_]\+\)\%(\s*:\s*\([[:alnum:]_.]\+\%(\[[[:alnum:]_[\],[:space:]]*\]\)\?\)\)\?\%(\s*=\s*\([^,]\+\)\)\?',
\  'parameters_group_names': ['name', 'type', 'default'],
\  'insert': 'below',
\}

" Here we only have to add the parameters and comment.
let b:doge_doc_standards.reST = [
\  extend({
\    'parameters_format': ':param {name} {type|!type}: !description'
\    'template': [
\      '"""',
\      '!description',
\      '',
\      '%(parameters|{parameters})%',
\      '%(returnType|:rtype {returnType}: !description)%',
\      '"""',
\    ],
\  }, s:func_template),
\]

" Here we only have to add the parameters and comment.
let b:doge_doc_standards.numpy = [
\  extend({
\    'parameters_format': [
\        '{name} : {type|!type}',
\        '\t!description',
\      ],
\    'template': [
\      '"""',
\      '!description',
\      '',
\      '%(parameters|{parameters})%',
\      '%(returnType|:rtype {returnType}: !description)%',
\      '"""',
\    ],
\  }, s:func_template),
\]

mg979 · 2019-11-26T17:06:45Z

About the thought of making a doc standard a dict: I'm not sure it's a good idea, because:

a doc standard should be its own thing, predictable in what it does, the user can define its own, and copy part of the code from the default ones
more importantly, inside the doc standard there are patterns, how would doge process them? Right now they are processed in order, so that the regex patterns are tried in a certain order (since b:doge_patterns is a list), not possible with a dictionary.
you would again have the user mess with nested dictionaries...

kkoomen · 2019-11-27T00:28:08Z

I agree with both suggestions. I also prefer linear pattern structures (which also prevents us from creating the deepextend() function) and that we should keep it a list as is.

I'll start implementing it. Thanks for your feedback.

mg979 · 2019-11-27T11:10:23Z

Maybe I was too harsh about nested dicts, also structures with short names like this would be fine

let s:template = {
\  'primary': { 'match': ..., 'groups': ...},
\  'parameters': { 'match': ..., 'groups': ..., 'format': ...},
\  'insert': ...,
\  'template': ...,
\}

But there's no need for that 'comment' nested dict. In scripts repeated patterns can also be stored in script variables if you want to avoid repetitions but also deepextend helpers and such.

kkoomen · 2019-11-27T12:50:23Z

Maybe I was too harsh about nested dicts, [...]

No worries. It can be nested and I will adjust the structure partially, yes, since some properties do not need to be nested. The only ones I will keep nested will be match, parameters and generator (the generator key is for now only used in C and C++).

Something else I thought about today is that: I don't want people to copy-paste patterns, then I have to do a bug fix later and then they're stuck with the old regex. I don't want people to manage the regex patterns. I will document this part as well, but my goal is that every ftpugin/<filetype>.vim will have it's own buffer variables or something like that and then if people want to extend them, they can do this. The bad part about this solution is that ~/.vim/ftplugin/<filetype>.vim will load earlier than the vim-doge/ftplugin/<filetype>.vim and I have no idea what a better solution is. Do you have any suggestions?

mg979 · 2019-11-27T14:04:53Z

First: one can use the after/ftplugin directory that is loaded after plugins
are sourced. You can also use the after directory. But this would mean
leaving things to chance, and I wouldn't do that.

I would do it like this, to ensure that all doc standards (stock and custom)
play nice with each other.

Assuming you have a single buffer variable b:doge_doc_standards that does the
job of both b:doge_patterns and b:doge_supported_doc_standards, and the
supported doc standards are the keys of b:doge_doc_standards (now
a dictionary).

In each ftplugin, yours or custom:

" pick up what has been defined already
let b:doge_doc_standards = get(b:, 'doge_doc_standards', {})

" only set the active doc standard if not done yet
let b:doge_doc_standard  = get(b:, 'doge_doc_standard', default)

" define the doc standards in a temporary script variable
let s:doc_standards = {
      \'reST' : ...,
      \'numpy': ...
      \}

" loop defined doc standards and populate the buffer variable, only if they
" have not been previously defined
for doc in keys(s:doc_standards)
  if !has_key(b:doge_doc_standards, doc)
    let b:doge_doc_standards[doc] = s:doc_standards[doc]
  endif
endfor

If there is only one doc standard to define it's easier because you can just
check if it already exists, and terminate the script, you don't need temporary
variables.

In this way you ensure that your ftplugins don't overwrite a custom ftplugin.
A custom ftplugin will probably want to overwrite an existing doc standard, but
it's not doge's job.

Then, you get the doc standards when running DogeGenerate. In the function
doge#generate you check if a valid doc standard is being used, otherwise you
stop.

kkoomen · 2019-11-27T14:25:47Z

First: one can use the after/ftplugin directory that is loaded after plugins
are sourced. You can also use the after directory. But this would mean
leaving things to chance, and I wouldn't do that.

Ah, forgot about the after ;) that'd work indeed!

If there is only one doc standard to define it's easier because you can just
check if it already exists, and terminate the script, you don't need temporary
variables.

In this way you ensure that your ftplugins don't overwrite a custom ftplugin.
A custom ftplugin will probably want to overwrite an existing doc standard, but
it's not doge's job.

Then, you get the doc standards when running DogeGenerate. In the function
doge#generate you check if a valid doc standard is being used, otherwise you
stop.

Agree. I'll take this into consideration when rewriting, thanks!

kkoomen · 2019-11-28T13:20:22Z

I don't want people to copy-paste patterns, then I have to do a bug fix later and then they're stuck with the old regex. I don't want people to manage the regex patterns. I will document this part as well, but my goal is that every ftpugin/.vim will have it's own buffer variables or something like that and then if people want to extend them, they can do this.

@mg979 What about this part? Because my suggestion is for the user that they can extend patterns via the after/ftplugin/<filetype>.vim and then to prevent them from duplicating patterns, they can use buffer variables. This is my lua example locally at this moment (the buffer-local variables are an example, in #65 it still uses script-local variables):

" ==============================================================================
"
" Define our base for every pattern.
"
" ==============================================================================
let b:pattern_base = {
\  'parameters': {
\    'match': '\m\([^,]\+\)',
\    'match_group_names': ['name'],
\    'format': '@param {name} !description',
\  },
\  'insert': 'above',
\}

" ==============================================================================
"
" Define the pattern types.
"
" ==============================================================================

" ------------------------------------------------------------------------------
" Matches regular function expressions and class methods.
" ------------------------------------------------------------------------------
" function new_function(p1, p2, p3, p4)
" local function new_function(p1, p2, p3)
" function BotDetectionHandler:access(p1, p2, p3)
" function a.b:c (p1, p2) body end
" a.b.c = function (self, p1, p2) body end
" ------------------------------------------------------------------------------
let b:function_and_class_method_pattern = doge#helpers#deepextend(b:pattern_base, {
\  'match': '\m^\%(local\s*\)\?function\s*\%([[:alnum:]_:.]\+[:.]\)\?\%([[:alnum:]_]\+\)\s*(\(.\{-}\))',
\  'match_group_names': ['parameters'],
\})

" ------------------------------------------------------------------------------
" Matches regular function expressions as a variable value.
" ------------------------------------------------------------------------------
" myprint = function(p1, p2)
" local myprint = function(p1, p2, p3, p4, p5)
" ------------------------------------------------------------------------------
let b:variable_function_pattern = doge#helpers#deepextend(b:pattern_base, {
\  'match': '\m^\%(local\s*\)\?\%([[:alnum:]_:.]\+[:.]\)\?\([[:alnum:]_]\+\)\s*=\s*\%(\s*function\s*\)\?(\(.\{-}\))',
\  'match_group_names': ['funcName', 'parameters'],
\})

" ==============================================================================
"
" Define the doc standards.
"
"==============================================================================
let b:doge_patterns.ldoc = [
\  doge#helpers#deepextend(b:function_and_class_method_pattern, {
\    'template': [
\      '-- !summary',
\      '-- !description',
\      '%(parameters|-- {parameters})%',
\    ],
\  }),
\  doge#helpers#deepextend(b:variable_function_pattern, {
\    'template': [
\      '-- !summary',
\      '-- !description',
\      '%(parameters|-- {parameters})%',
\    ],
\  }),
\]

The structure is as follows:

Base pattern
Pattern types
Define the doc standards with the pattern types

This is a setup that works for multiple doc standards and multiple pattern types (for example: C++ has structs, functions and fields as pattern types) and prevent a lot of duplicate code.

If the user want to create a custom doc standard they can do the following:

" ~/.vim/after/ftplugin/lua.vim

let b:doge_supported_doc_standards = ['test']
let b:doge_doc_standard = 'test'

let b:doge_patterns = doge#buffer#get_patterns()

let b:doge_patterns.ldoc = [
\  doge#helpers#deepextend(b:function_and_class_method_pattern, {
\    'template': [
\      '-- I want every function to have this summary', " <-- added this
\      '-- !summary',
\      '-- !description',
\      '%(parameters|-- {parameters})%',
\    ],
\  }),
\  doge#helpers#deepextend(b:variable_function_pattern, {
\    'template': [
\      '-- I want every function to have this summary', " <-- added this
\      '-- !summary',
\      '-- !description',
\      '%(parameters|-- {parameters})%',
\    ],
\  }),
\]

The thing I also doubt about is whether to add doge prefix for every buffer-local variable. For example: b:variable_function_pattern becomes b:doge_variable_function_pattern. Just to prevent conflicts with other plugins in any case. What do you suggest?

I think this is a good rewrite that solves #60 and also removes redundant code. Any feedback?

kkoomen · 2019-11-28T13:51:15Z

Progress can be tracked via #65. I'm done with all the languages and updates all tests. I also suggest we continue the discussion at #65.

mg979 · 2019-11-28T17:27:02Z

Just one thing here: it's a very bad idea to use multiple buffer variables, especially with those generic names. Totally unneeded, in the case that somebody wants to change only a regex pattern, without writing a whole doc standard, he doesn't even need to open the source, he just needs to do

:let @" = b:doge_patterns[doc_standard].match

and paste and change it in his own after/ftplugin file, like

" don't throw errors if doge isn't loaded
silent! let b:doge_patterns[doc_standard].match = my_new_match

Just use script variables for templates, don't expect somebody may want to use them.
The very important thing for this to work is that doge doesn't overwrite already defined (and maybe user-modified) doc standards, define the doc standards once and then let the command use whatever it finds, user-modified (or even user-defined) or not.

Note that custom docstandard in ftplugins already work up to a certain measure (I have my vimscript patterns modeled on the script lua), the main thing here is to make the process safe.

The only needed buffer variable is b:doge_patterns I think.

Also, why don't you use a shorter name like groups instead of match_group_names?

kkoomen · 2019-11-29T13:57:19Z

Just one thing here: it's a very bad idea to use multiple buffer variables, especially with those generic names. Totally unneeded, in the case that somebody wants to change only a regex pattern, without writing a whole doc standard, he doesn't even need to open the source, he just needs to do
:let @" = b:doge_patterns[doc_standard].match

This won't work, because b:doge_patterns[doc_standard] is an array, so you have to do: :let @" = b:doge_patterns[doc_standard][0].match

The very important thing for this to work is that doge doesn't overwrite already defined (and maybe user-modified) doc standards, define the doc standards once and then let the command use whatever it finds, user-modified (or even user-defined) or not.

This is working now. The plugin will persist existing b:doge_patterns and b:doge_doc_standard and will only add the built-in patterns if the user did not already define them.

Also, why don't you use a shorter name like groups instead of match_group_names?

This was the name I came up with when I created it. The match key has groups defined in it and you're giving the groups a name. That's how I developed the name. It does what it says it does, but programmatically I'm calling them tokens (some folks call them placeholders). Since I'm rewriting anyway, I might want to rename it to tokens because that's what it is.

The idea of doge is: match a pattern -> extract the data -> create tokens from the data -> use the tokens in a template.

kkoomen · 2019-11-29T14:07:48Z

@mg979 Do you think the current implementation of #65 resolves #60? Because how do you think a user is able to create a custom doc standard without writing custom patterns?

Allow different insertion points for each standard

b818088

pattern['comment']['insert'] can be defined as a dictionary, so that each b:doge_doc_standard can behave differently in this regard

mg979 closed this Nov 26, 2019

mg979 deleted the insert_point branch November 26, 2019 10:23

kkoomen mentioned this pull request Nov 28, 2019

Rewrite ftplugin structure #65

Merged

22 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow different insertion points for each standard #52

Allow different insertion points for each standard #52

Allow different insertion points for each standard #52

Allow different insertion points for each standard #52

Conversation

Prelude

Why this PR?