Simstring (Python) installation on Windows

观众

I am trying to install the simstring Python wrapper on Windows from https://github.com/Georgetown-IR-Lab/simstring. On Linux it works fine, but on Windows the installation fails with an error.

    D:\Users\source\repos>python setup.py install
    running install
    running build
    running build_py
    running build_ext
    building '_simstring' extension
    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I. -IC:\ProgramData\Anaconda3\include -IC:\ProgramData\Anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\winrt" /EHsc /Tpexport.cpp /Fobuild\temp.win-amd64-3.6\Release\export.obj
    export.cpp
    export.cpp(7): fatal error C1083: Cannot open include file: 'iconv.h': No such file or directory
    error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.12.25827\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2

After that, I included iconv.h in the project, but now it shows a different error.

    running install
    running build
    running build_py
    running build_ext
    building '_simstring' extension
    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I. -IC:\ProgramData\Anaconda3\include -IC:\ProgramData\Anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\winrt" /EHsc /Tpexport.cpp /Fobuild\temp.win-amd64-3.6\Release\export.obj
    export.cpp
    d:\users\aki\source\repos\simstring\cdbpp.h(101): warning C4267: 'initializing': conversion from 'size_t' to 'uint32_t', possible loss of data
    export.cpp(37): error C2664: 'size_t libiconv(libiconv_t,const char **,size_t *,char **,size_t *)': cannot convert argument 2 from 'char **' to 'const char **'
    export.cpp(37): note: Conversion loses qualifiers
    export.cpp(140): note: see reference to function template instantiation 'bool iconv_convert<std::string,std::wstring>(libiconv_t,const source_type &,destination_type &)' being compiled
            with
            [
                source_type=std::string,
                destination_type=std::wstring
            ]
    error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.12.25827\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2

Any help or guidance is appreciated.

CristiFati

Ground notes

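  • I managed to get through the build process, but I got stuck at one point. I created [SO]: Compile error for (char based) STL (stream) containers in Visual Studio (I spent a lot of time on that issue). I got it working somehow, but other (similar?) errors appeared when trying to build simstring, so I had to strip out some (Nix-based) code (that did not compile).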

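  • simstring is written in C++. Building C++ (C) code yields a PE (Portable Executable: .exe, .dll). See [SO]: LNK2005 Error in CLR Windows Form (@CristiFati's answer) for more details on how code is transformed. When dealing with an .exe that depends on (loads) .dlls, there are some restrictions: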

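    • The architecture (32 or 64 bit, i.e. x86 or x64 (AMD64)) of the .exe (python.exe in this case) must match that of every .dll it loads (and of the .dlls that a loaded .dll loads in turn, and so on). Every .dll in the dependency tree must load, otherwise the .dll that needs it will fail to load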

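    • In some cases, the build configuration (Debug vs. Release) should match. If it does not, something like [SO]: Linker errors in executable when using fstream in library (@CristiFati's answer) can happen, but I don't think we are in that situation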

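    • In some (other) cases, the build tools should match as well [1]
    • simstring depends on libiconv, which also comes as a .dll (there are actually more, but we only care about one of them). Checking the .dll with Dependency Walker (see below) shows that it is x86 [2]. This means either:
      • Use Python 32-bit (x86). This is the variant I am going to use. Given [1] and [2], the only usable version on my machine is Python 3.6 x86 (my version of choice is Python 3.5, which I also had in 32-bit form, but I messed it up and did not reinstall it)
      • Build libiconv from source, to get rid of limitation [2]. However, that could take some time and is beyond the scope of the current question. If a question about building it comes up, I'll take some time to give it a try, as I like this kind of task ([SO]: How to build a DLL version of libjpeg 9b? (@CristiFati's answer))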
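
The architecture restriction above can also be checked without Dependency Walker. The following is a minimal sketch, not part of the original answer: it reads the machine field from a PE file header and reports the pointer size of the running interpreter. The function names (`interpreter_bits`, `pe_machine`) are made up for illustration, and only the two machine values relevant here (x86 and x64) are mapped.

```python
import struct

# Machine field values from the PE/COFF file header (only the two used here).
PE_MACHINES = {0x014C: "x86", 0x8664: "x64"}


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def pe_machine(path):
    """Return 'x86', 'x64', or a hex string for the PE file at `path`."""
    with open(path, "rb") as f:
        if f.read(2) != b"MZ":
            raise ValueError("not a PE file: missing MZ signature")
        # Offset 0x3C of the DOS header stores the position of the PE signature.
        f.seek(0x3C)
        (pe_offset,) = struct.unpack("<I", f.read(4))
        f.seek(pe_offset)
        if f.read(4) != b"PE\0\0":
            raise ValueError("not a PE file: missing PE signature")
        # The 16-bit machine field directly follows the signature.
        (machine,) = struct.unpack("<H", f.read(2))
    return PE_MACHINES.get(machine, hex(machine))


print("python is", interpreter_bits(), "bit")
```

Calling `pe_machine` on python.exe, on the built .pyd and on libiconv2.dll should give the same answer for all three if the restriction is satisfied.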

Walkthrough

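  • Create a directory and cd to it (it should be empty). This will be %ROOT_DIR%; all the paths I use will be relative to it (except absolute ones, of course), and it is the default directory (when none is specified)
  • Download the simstring sources ([GitHub]: Georgetown-IR-Lab/simstring - simstring-master.zip)
  • Unpack the archive; it will be extracted into the simstring-master dir (created automatically)
  • Create a directory libiconv and download into it:
    1. [SourceForge]: gnuwin32/GnuWin - libiconv-1.9.2-1-lib.zip
    2. [SourceForge]: gnuwin32/GnuWin - libiconv-1.9.2-1-bin.zip
    3. Extract the required content from these files:
      • From #1.:
        • include dir - used in the compile phase
        • lib dir - used in the link phase
        • Both phases are carried out by setup.py (below)
      • From #2.:
        • bin dir - used at runtime (when using (importing) the module)
  • cd to the simstring-master dir. To build the extension I use setup.py's build_ext command (which install invokes in turn, as your output shows): [Python 3]: distutils.command.build_ext - Build any extensions in a package
  • Running build_ext yields the error: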

    export.cpp(7): fatal error C1083: Cannot open include file: 'iconv.h': No such file or directory
    

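    That is because the Python build system does not know what we did (in the libiconv dir). To let it know, pass these flags (python setup.py build_ext --help lists them all):

    1. -I (--include-dirs) - translated to [MS.Docs]: /I (Additional include directories)
    2. -L (--library-dirs) - translated to [MS.Docs]: /LIBPATH (Additional Libpath)
    3. -l (--libraries) - translated to [MS.Docs]: LINK Input Files

    For now, do not pass #2. and #3., because we will not get to the link phase (where they are needed):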

    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>"e:\Work\Dev\VEnvs\py36x86_test\Scripts\python.exe" setup.py build_ext -I"../libiconv/include"
    running build_ext
    building '_simstring' extension
    C:\Install\x86\Microsoft\Visual Studio Community\2015\VC\BIN\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I. -I../libiconv/include -Ic:\Install\x86\Python\Python\3.6\include -Ic:\Install\x86\Python\Python\3.6\include "-IC:\Install\x86\Microsoft\Visual Studio Community\2015\VC\INCLUDE" "-IC:\Install\x86\Microsoft\Visual Studio Community\2015\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\winrt" /EHsc /Tpexport.cpp /Fobuild\temp.win32-3.6\Release\export.obj
    export.cpp
    export.cpp(112): warning C4297: 'writer::~writer': function assumed not to throw an exception but does
    export.cpp(112): note: destructor or deallocator has a (possibly implicit) non-throwing exception specification
    export.cpp(126): warning C4297: 'writer::~writer': function assumed not to throw an exception but does
    export.cpp(126): note: destructor or deallocator has a (possibly implicit) non-throwing exception specification
    export.cpp(37): error C2664: 'size_t libiconv(libiconv_t,const char **,size_t *,char **,size_t *)': cannot convert argument 2 from 'char **' to 'const char **'
    export.cpp(37): note: Conversion loses qualifiers
    export.cpp(140): note: see reference to function template instantiation 'bool iconv_convert<std::basic_string<char,std::char_traits<char>,std::allocator<char>>,std::wstring>(libiconv_t,const source_type &,destination_type &)' being compiled
    with
    [
        source_type=std::basic_string<char,std::char_traits<char>,std::allocator<char>>,
        destination_type=std::wstring
    ]
    error: command 'C:\\Install\\x86\\Microsoft\\Visual Studio Community\\2015\\VC\\BIN\\cl.exe' failed with exit status 2
    
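  • Things to do (found by fixing the errors one at a time; only export.cpp needs changes):

    1. #define ICONV_CONST const (cl.exe will not implicitly convert char ** to const char **)
    2. #define __SIZEOF_WCHAR_T__ 2 (as sizeof(wchar_t) is 2)
    3. Strip out the code that does not compile (the one I mentioned at the beginning): STL containers of 4-byte chars do not compile on Win. I wanted to fix the code so that it would compile OOTB once Win supports such chars, but I could not, so I had to do whatever OSX does. Hence, replace (5 times) #ifdef __APPLE__ with #if defined(__APPLE__) || defined(WIN32)


    Notes: #1. and #2. could (should) be done from the cmdline (via the -D flag, but I was not able to specify a value for a defined flag) or in setup.py (so that they are defined only once even if they are needed in many files), but I did not spend much time on that, so I replaced them directly in the source code.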
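
As an illustration of the setup.py route for #1. and #2., here is a hedged sketch. It only builds the keyword arguments one might pass to an Extension object; the helper name `simstring_extension_kwargs` and its `platform` parameter are hypothetical, and the repository's actual setup.py may be organised differently. The source file names come from the compiler output in this answer. Unlike the -D command line flag, `define_macros` accepts (name, value) pairs.

```python
import os
import sys


def simstring_extension_kwargs(libiconv_root, platform=sys.platform):
    """Keyword arguments for the _simstring extension (illustrative only).

    `libiconv_root` is the dir holding the unpacked include/ and lib/ dirs.
    """
    kwargs = {"sources": ["export.cpp", "export_wrap.cpp"]}
    if platform == "win32":
        kwargs.update(
            include_dirs=[os.path.join(libiconv_root, "include")],  # like -I
            library_dirs=[os.path.join(libiconv_root, "lib")],      # like -L
            libraries=["libiconv"],                                 # like -l
            define_macros=[
                ("ICONV_CONST", "const"),     # item 1 of the list above
                ("__SIZEOF_WCHAR_T__", "2"),  # item 2 of the list above
            ],
        )
    return kwargs


for key, value in simstring_extension_kwargs(os.path.join("..", "libiconv")).items():
    print(key, "=", value)

# In setup.py this could be used roughly as (assumption, not verified here):
#   from setuptools import Extension
#   ext = Extension("_simstring", **simstring_extension_kwargs("../libiconv"))
```

This works for ICONV_CONST because export.cpp only defines it inside an #ifndef guard, so a value supplied by the build takes precedence. The #ifdef __APPLE__ replacements (#3.) would still have to be made in the source.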


    Either apply the changes manually, or save:

    --- export.cpp.orig 2016-11-30 18:53:32.000000000 +0200
    +++ export.cpp  2018-02-14 13:36:31.317953200 +0200
    @@ -19,9 +19,18 @@
     #endif/*USE_LIBICONV_GNU*/
    
     #ifndef ICONV_CONST
    +#if defined (WIN32)
    +#define ICONV_CONST const
    +#else
     #define ICONV_CONST
    +#endif
     #endif/*ICONV_CONST*/
    
    +#if defined (WIN32)
    +#define __SIZEOF_WCHAR_T__ 2
    +#endif
    +
    +
     template <class source_type, class destination_type>
     bool iconv_convert(iconv_t cd, const source_type& src, destination_type& dst)
     {
    @@ -269,7 +278,7 @@
         iconv_close(bwd);
     }
    
    -#ifdef __APPLE__
    +#if defined(__APPLE__) || defined(WIN32)
     #include <cassert>
     #endif
    
    @@ -283,7 +292,7 @@
             retrieve_thru(dbr, query, this->measure, this->threshold, std::back_inserter(ret));
             break;
         case 2:
    -#ifdef __APPLE__
    +#if defined(__APPLE__) || defined(WIN32)
     #if __SIZEOF_WCHAR_T__ == 2
             retrieve_iconv<wchar_t>(dbr, query, UTF16, this->measure, this->threshold, std::back_inserter(ret));
     #else
    @@ -294,7 +303,7 @@
     #endif
             break;
         case 4:
    -#ifdef __APPLE__
    +#if defined(__APPLE__) || defined(WIN32)
     #if __SIZEOF_WCHAR_T__ == 4
             retrieve_iconv<wchar_t>(dbr, query, UTF32, this->measure, this->threshold, std::back_inserter(ret));
     #else
    @@ -317,7 +326,7 @@
             std::string qstr = query;
             return dbr.check(qstr, translate_measure(this->measure), this->threshold);
         } else if (dbr.char_size() == 2) {
    -#ifdef __APPLE__
    +#if defined(__APPLE__) || defined(WIN32)
     #if __SIZEOF_WCHAR_T__ == 2
             std::basic_string<wchar_t> qstr;
     #else
    @@ -333,7 +342,7 @@
             iconv_close(fwd);
             return dbr.check(qstr, translate_measure(this->measure), this->threshold);
         } else if (dbr.char_size() == 4) {
    -#ifdef __APPLE__
    +#if defined(__APPLE__) || defined(WIN32)
     #if __SIZEOF_WCHAR_T__ == 4
             std::basic_string<wchar_t> qstr;
     #else
    

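    as simstring_win.diff. That is a diff. See [SO]: Run/Debug a Django application's UnitTests from the mouse right click context menu in PyCharm Community Edition? (@CristiFati's answer) (Patching utrunner section) for how to apply patches on Win (basically, every line that starts with one "+" sign goes in, and every line that starts with one "-" sign goes out). I am using Cygwin, by the way.
    I also submitted this patch to [GitHub]: Georgetown-IR-Lab/simstring - Support for Win, and it was merged today (180222).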
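
The "+" goes in, "-" goes out rule can be sketched in a few lines of Python. This is only an illustration of how the body of one unified-diff hunk maps to the text before and after the change; it is not a replacement for patch.exe, and the function name `split_hunk` is made up for this example.

```python
def split_hunk(hunk_lines):
    """Split the body of one unified-diff hunk into (before, after) lines."""
    before, after = [], []
    for line in hunk_lines:
        marker, text = line[:1], line[1:]
        if marker == "-":        # present only in the old file
            before.append(text)
        elif marker == "+":      # present only in the new file
            after.append(text)
        elif marker in (" ", ""):  # context line, unchanged in both files
            before.append(text)
            after.append(text)
        else:
            raise ValueError("unexpected hunk line: %r" % line)
    return before, after


# A shortened copy of one hunk from the diff above.
hunk = [
    " }",
    "",
    "-#ifdef __APPLE__",
    "+#if defined(__APPLE__) || defined(WIN32)",
    " #include <cassert>",
    " #endif",
]
before, after = split_hunk(hunk)
print("\n".join(after))
```

Applying the changes manually amounts to finding the `before` lines in export.cpp and replacing them with the `after` lines, once per hunk.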

    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>"c:\Install\x64\Cygwin\Cygwin\AllVers\bin\patch.exe" -i "../simstring_win.diff"
    patching file export.cpp
    
    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>rem Looking at export.cpp content, you'll notice the changes
    
    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>"e:\Work\Dev\VEnvs\py36x86_test\Scripts\python.exe" setup.py build_ext  -I"../libiconv/include" -L"../libiconv/lib" -llibiconv
    running build_ext
    building '_simstring' extension
    C:\Install\x86\Microsoft\Visual Studio Community\2015\VC\BIN\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I. -I../libiconv/include -Ic:\Install\x86\Python\Python\3.6\include -Ic:\Install\x86\Python\Python\3.6\include "-IC:\Install\x86\Microsoft\Visual Studio Community\2015\VC\INCLUDE" "-IC:\Install\x86\Microsoft\Visual Studio Community\2015\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\winrt" /EHsc /Tpexport.cpp /Fobuild\temp.win32-3.6\Release\export.obj
    export.cpp
    export.cpp(121): warning C4297: 'writer::~writer': function assumed not to throw an exception but does
    export.cpp(121): note: destructor or deallocator has a (possibly implicit) non-throwing exception specification
    export.cpp(135): warning C4297: 'writer::~writer': function assumed not to throw an exception but does
    export.cpp(135): note: destructor or deallocator has a (possibly implicit) non-throwing exception specification
    C:\Install\x86\Microsoft\Visual Studio Community\2015\VC\BIN\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I. -I../libiconv/include -Ic:\Install\x86\Python\Python\3.6\include -Ic:\Install\x86\Python\Python\3.6\include "-IC:\Install\x86\Microsoft\Visual Studio Community\2015\VC\INCLUDE" "-IC:\Install\x86\Microsoft\Visual Studio Community\2015\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\winrt" /EHsc /Tpexport_wrap.cpp /Fobuild\temp.win32-3.6\Release\export_wrap.obj
    export_wrap.cpp
    C:\Install\x86\Microsoft\Visual Studio Community\2015\VC\BIN\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:c:\Install\x86\Python\Python\3.6\Libs /LIBPATH:../libiconv/lib /LIBPATH:e:\Work\Dev\VEnvs\py36x86_test\libs /LIBPATH:e:\Work\Dev\VEnvs\py36x86_test\PCbuild\win32 "/LIBPATH:C:\Install\x86\Microsoft\Visual Studio Community\2015\VC\LIB" "/LIBPATH:C:\Install\x86\Microsoft\Visual Studio Community\2015\VC\ATLMFC\LIB" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.16299.0\ucrt\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib\um\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.16299.0\um\x86" libiconv.lib /EXPORT:PyInit__simstring build\temp.win32-3.6\Release\export.obj build\temp.win32-3.6\Release\export_wrap.obj /OUT:build\lib.win32-3.6\_simstring.cp36-win32.pyd /IMPLIB:build\temp.win32-3.6\Release\_simstring.cp36-win32.lib
       Creating library build\temp.win32-3.6\Release\_simstring.cp36-win32.lib and object build\temp.win32-3.6\Release\_simstring.cp36-win32.exp
    Generating code
    Finished generating code
    
    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>dir /b "build\lib.win32-3.6"
    _simstring.cp36-win32.pyd
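
The tag in the file name (cp36-win32) ties the module to the interpreter version and platform it was built for. A small sketch, using only the standard library, lists the file names the running interpreter would accept for a module called _simstring; the helper name `extension_filenames` is hypothetical.

```python
import importlib.machinery
import sysconfig


def extension_filenames(module_name):
    """File names the running interpreter would accept for an extension module."""
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


print("preferred suffix:", sysconfig.get_config_var("EXT_SUFFIX"))
for name in extension_filenames("_simstring"):
    print(name)
```

If the file produced by the build is not among the printed names, the interpreter running the script is probably not the one the module was built with.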
    
  • Finally, it built. A .pyd is just a .dll. This is how it looks in Dependency Walker:

    _simstring.pyd

  • Let's see whether we can use it:

    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>"e:\Work\Dev\VEnvs\py36x86_test\Scripts\python.exe" sample.py
    Traceback (most recent call last):
      File "E:\Work\Dev\StackOverflow\q048528041\simstring-master\simstring.py", line 18, in swig_import_helper
        fp, pathname, description = imp.find_module('_simstring', [dirname(__file__)])
      File "e:\Work\Dev\VEnvs\py36x86_test\lib\imp.py", line 296, in find_module
        raise ImportError(_ERR_MSG.format(name), name=name)
    ImportError: No module named '_simstring'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "sample.py", line 3, in <module>
        import simstring
      File "E:\Work\Dev\StackOverflow\q048528041\simstring-master\simstring.py", line 28, in <module>
        _simstring = swig_import_helper()
      File "E:\Work\Dev\StackOverflow\q048528041\simstring-master\simstring.py", line 20, in swig_import_helper
        import _simstring
    ModuleNotFoundError: No module named '_simstring'
    

    That is because importing simstring in turn imports _simstring (the .pyd), which Python cannot find. To fix this:

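    • Add the .pyd path to %PYTHONPATH%
    • As the image shows, the .pyd depends on libiconv2.dll, so the OS must know where to look for it. The simplest way is to add its path to %PATH% ([MS.Docs]: Dynamic-Link Library Search Order)
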
    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>set PYTHONPATH=%PYTHONPATH%;build\lib.win32-3.6
    
    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>set PATH=%PATH%;..\libiconv\bin
    
    (py36x86_test) E:\Work\Dev\StackOverflow\q048528041\simstring-master>"e:\Work\Dev\VEnvs\py36x86_test\Scripts\python.exe" sample.py
    ('Barack Hussein Obama II',)
    ('James Gordon Brown',)
    ()
    ('Barack Hussein Obama II',)
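
The two set commands can also be mirrored from inside Python, for the current process only. The sketch below is an assumption-laden equivalent, not something from the original answer: `prepare_import` is a hypothetical helper, and the extra `os.add_dll_directory` call is there because Python 3.8 and later on Windows no longer consult PATH when resolving the dependencies of extension modules (the walkthrough uses Python 3.6, where PATH is enough).

```python
import os
import sys


def prepare_import(pyd_dir, dll_dir):
    """Make an extension module and the DLL it depends on findable."""
    pyd_dir = os.path.abspath(pyd_dir)
    dll_dir = os.path.abspath(dll_dir)
    # Same effect as appending to %PYTHONPATH% before starting Python.
    if pyd_dir not in sys.path:
        sys.path.append(pyd_dir)
    # Same effect as appending to %PATH%.
    os.environ["PATH"] = os.environ.get("PATH", "") + os.pathsep + dll_dir
    # Only available on Windows with Python 3.8+; needs an existing absolute dir.
    if hasattr(os, "add_dll_directory") and os.path.isdir(dll_dir):
        os.add_dll_directory(dll_dir)


# Usage with the paths from the walkthrough (run from the simstring-master dir):
#   prepare_import(r"build\lib.win32-3.6", r"..\libiconv\bin")
#   import simstring
```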
    

Final notes

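  • The module produces some output, identical to the output on Lnx (Ubtu) (where I also built it, without any problems), but I am not sure whether it is semantically correct
  • I did not run setup.py's install command (and I won't). One thing I can think of that might go wrong (though I am not sure it would) is libiconv2.dll not being copied/included into the pkg. If so, you might need to tweak setup.py (the changes should be minimal)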
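
One possible way to deal with the missing .dll, sketched under assumptions: copy libiconv2.dll next to the built .pyd, since to my knowledge Windows also looks for an extension module's dependencies in the directory that holds the module. The helper name `bundle_dll` is hypothetical, and whether setup.py install then carries the copied file along has not been verified here.

```python
import os
import shutil


def bundle_dll(dll_path, build_lib_dir):
    """Copy a dependent DLL next to the built extension; return the new path."""
    os.makedirs(build_lib_dir, exist_ok=True)
    # copy2 returns the path of the created file when the target is a directory.
    return shutil.copy2(dll_path, build_lib_dir)


# Usage with the paths from the walkthrough (run from the simstring-master dir):
#   bundle_dll(r"..\libiconv\bin\libiconv2.dll", r"build\lib.win32-3.6")
```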
